(57) Abstract

A method of generating a bit stream by multiplexing non-compressed
auxiliary information with an information stream. The information
stream is obtained by compressing fixed-size units of an information
signal with a varying compression ratio to provide varying-sized
units of the information stream. The auxiliary information is for
use in subsequently decoding the information stream. Units of the
auxiliary information correspond to the units of the information
signal. In the method, the information stream is divided in time
into information stream portions. The non-compressed auxiliary
information is also divided in time into auxiliary information
portions. The information stream portions and the auxiliary
information portions are interleaved to provide the bit stream.
Finally, the information stream dividing, auxiliary information
dividing, and interleaving steps are controlled by emulating decoding
of the bit stream by a hypothetical system target decoder. The
hypothetical system target decoder includes a demultiplexer that
demultiplexes the bit stream, a serial arrangement of an information
stream buffer and an information stream decoder, and a serial
arrangement of an auxiliary information buffer and an auxiliary
information processor. Each serial arrangement is connected to the
demultiplexer. The information stream dividing, auxiliary information
dividing, and interleaving steps are controlled such that the
information stream buffer and the auxiliary information buffer
neither overflow nor underflow.
DESCRIPTION

RATIONAL INPUT BUFFER ARRANGEMENTS FOR AUXILIARY
INFORMATION IN VIDEO AND AUDIO SIGNAL PROCESSING SYSTEMS

Technical Field

The present invention relates to apparatus for
compressing and expanding digital information signals,
and, in particular, to the buffering of auxiliary
information included with information signals compressed
with a dynamically varying compression ratio.

Background Art 

For storage on or distribution via such media as
CD-ROMs, laser disks (LDs), video tapes, magneto-optical
(MO) storage media, digital compact cassette (DCC),
terrestrial or satellite broadcasting, cable systems,
fibre-optic distribution systems, telephone systems, ISDN
systems, etc., video and audio signals are compressed and
coded, and the resulting video stream and audio stream are
then multiplexed to provide a bit stream for feeding to
the medium. The bit stream is later reproduced from the
medium, is demultiplexed, and the resulting video stream
and audio stream are decoded and expanded to recover the
original audio and video signals.
[illegible]

The MPEG standards are established under the assumption
that they will be used in a wide range of applications.
As a result, the standards allow for such [illegible] and
a non phase-locked system, in which [illegible] clock of
the video system operate independently. [illegible] the
MPEG standards require the addition of a time stamp to the
multiplexed bit stream at least once every 0.7 seconds,
and that the encoder provide separate time stamps for use
by the audio decoder and by the video decoder.

One of the aims of the MPEG standards is to provide
maximum flexibility for encoder and decoder design while
ensuring that the bit stream provided by any encoder can
be successfully decoded by any decoder. One of the ways
in which this compatibility is established is by the
concept of the System Target Decoder.

A typical audio and video signal processing system 110
according to the MPEG-1 and MPEG-2 standards is shown in
Figure 1. In this, the encoder 100 receives the video
signal S2 from the video signal storage medium 2, and
receives the audio signal S3 from the audio signal storage
medium 3. The audio signal S3 could alternatively be (and
is more usually) also received from the video signal
storage medium 2 instead of from a separate audio storage
medium.

The encoder 100 compresses and codes the video and
audio signals, and multiplexes the resulting audio stream
and video stream to provide the multiplexed bit stream
S100, which is fed for storage or distribution by the
medium 5. The medium can be any medium suitable for
storing or distributing a digital bit stream, for example,
a CD-ROM, a laser disk (LD), a video tape, a
magneto-optical (MO) storage medium, a digital compact
cassette (DCC), a terrestrial or satellite broadcasting
system, a cable system, a fibre-optic distribution system,
a telephone system, an ISDN system, etc.

The encoder 100 compresses and codes the video signal
picture-by-picture. Each picture of the video signal is
compressed in one of three compression modes. A picture
compressed in the intra-picture compression mode is called
an I-picture [illegible] of the video signal. [illegible]
inter-picture compression [illegible]. A P-picture is
compressed using forward [illegible]. B-pictures
[illegible] in the video signal. Each [illegible] as a
reference block any one of the following: a block
[illegible] P-picture or I-picture (i.e., a picture
occurring later in [illegible]), or a block obtained by
performing linear processing on a block of a previous
I-picture or P-picture and a block of a following I-picture
or P-picture. In addition, blocks of a B-picture may be
compressed in the intra-picture compression mode.
Typically, about 150 Kbits (1 Kbit = 1024 bits) of the
video stream are required for an I-picture, 75 Kbits of
the video stream are required for a P-picture, and 5 Kbits
of the video stream are required for a B-picture.
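
As a rough worked example of what these typical figures imply, the
following Python sketch totals the video stream needed for a run of
pictures. The per-picture sizes (about 150 Kbits for an I-picture,
75 Kbits for a P-picture, 5 Kbits for a B-picture) are the typical
values quoted in this description; the 15-picture pattern and the
function names are assumptions chosen only for illustration.

```python
KBIT = 1024  # bits per Kbit, as defined in the description

# Typical video stream sizes per picture type quoted in the description.
TYPICAL_BITS = {"I": 150 * KBIT, "P": 75 * KBIT, "B": 5 * KBIT}


def sequence_bits(pattern: str) -> int:
    """Total video stream bits for a string of picture types such as 'IBBP'."""
    return sum(TYPICAL_BITS[p] for p in pattern)


def average_bit_rate(pattern: str, picture_rate_hz: float) -> float:
    """Average bits/s when the pattern repeats at the given picture rate."""
    return sequence_bits(pattern) * picture_rate_hz / len(pattern)


# Hypothetical pattern: 1 I-picture, 4 P-pictures, 10 B-pictures.
PATTERN = "IBBPBBPBBPBBPBB"
total_bits = sequence_bits(PATTERN)  # 500 Kbits for this pattern
```

The point of the arithmetic is that a single I-picture accounts for
30 times the video stream of a B-picture, which is why the amount
removed from the video input buffer changes so sharply from picture
to picture.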

The digital video and audio processing system
includes the decoder 600, which receives as its input
signal the bit stream S5 from the medium 5. The decoder
performs demultiplexing inverse to the multiplexing
performed in the encoder 100. The decoder also applies
decoding and expansion to the resulting audio stream and
video stream using processing complementary to that
performed by the encoder 100 to provide the recovered
video signal 6A and the recovered audio signal 6B. The
recovered video signal 6A and the recovered audio signal
6B respectively closely match the video signal S2 and the
audio signal S3 fed into the encoder 100.

Figure 1 also shows the system target decoder (STD)
400 which is used to define the processing performed by
the encoder 100 and the decoder 600. In practical video
and audio signal processing systems, the encoder seldom
includes an actual system target decoder, but instead
performs the encoding processing and multiplexing taking
account of the system target decoder parameters. Also, in
practical systems, the decoder is designed to have
performance equalling or exceeding that of the system
target decoder. These relationships between the system
target decoder and the encoder and the decoder are
indicated in Figure 1 by the broken line labelled S4A
interconnecting the system target decoder and the encoder,
and the broken line labelled S4B interconnecting the
system target decoder and the decoder.

The system target decoder 400 is also known as a
hypothetical system target decoder, system reference
decoder, or reference decoding processing system. From
now on, it will be referred to as a system target decoder.

System target decoders are defined in international
standard specifications such as CCITT H.261 and the MPEG-1
standard to provide guidelines for the designers of video
and audio encoders and decoders for these standards.

In the MPEG-1 system standard, the system target
decoder includes a reference video decoder and a reference
audio decoder. In addition, the system target decoder
includes an input buffer for the reference video decoder
and an input buffer for the reference audio decoder. The
size of each input buffer is defined in the standard. The
standard also defines the operation of the two reference
decoders, especially with regard to the way in which they
remove the audio stream and the video stream from their
respective buffers.

The concept of the system target decoder provides
compatibility between encoders and decoders of different
designs as follows. All encoders are designed to provide
a bit stream that can be successfully decoded by the
system target decoder, and that does not cause the input
buffers of the system target decoder to overflow or
underflow. In addition, all decoders are designed to have
performance parameters that are equal to or better than
those defined for the system target decoder. As a result,
all such decoders will be capable of successfully decoding
the bit stream produced by any of the encoders designed to
produce a bit stream capable of being decoded by the
system target decoder. The bit stream produced for
decoding by the system target decoder is called a
"constraint system parameter stream".

The structure of the hypothetical system target
decoder 400 shown in Figure 1 is as follows. The
demultiplexer 401 notionally receives the bit stream S100
from the encoder 100. The demultiplexer 401 demultiplexes
the bit stream into a video stream and an audio stream.
The video stream is fed to the video input buffer 402, the
output of which is connected to the video decoder 405.
The audio stream from the demultiplexer 401 is fed into
the audio input buffer 403, the output of which is
connected to the audio decoder 406. In the example shown
in Figure 1, the video input buffer 402 has a storage
capacity of 46K bytes and the audio input buffer 403 has a
storage capacity of 4K bytes, as specified by the MPEG-1
standard. The video decoder 405 removes the video stream
from the video input buffer 402 one video access unit at a
time, i.e., one picture at a time, at a timing corresponding
to the picture rate of the video signal, e.g., once every
1/29.97 seconds in an NTSC system. The amount of the
video stream removed from the video input buffer for each
picture varies because of the different amount of
compression applied to each picture. The audio decoder
406 removes the audio stream from the audio input buffer
403 one audio access unit at a time, at a predetermined
timing.
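
The behaviour of one branch of the system target decoder can be
sketched as a small simulation. The following Python sketch is only a
model under stated assumptions: the stream is assumed to arrive at a
constant rate, each picture is removed instantaneously, and the
function and parameter names are hypothetical. Only the 46K byte
capacity is taken from the text.

```python
from typing import List, Tuple

VIDEO_BUFFER_BITS = 46 * 1024 * 8  # 46K bytes, the figure quoted in the text


def simulate_input_buffer(
    picture_bits: List[int],
    input_rate_bps: float,
    picture_period_s: float,
    startup_delay_s: float,
    capacity_bits: int = VIDEO_BUFFER_BITS,
) -> Tuple[bool, bool]:
    """Return (overflowed, underflowed) for a simple input-buffer model.

    Bits arrive at a constant rate (an assumption). The decoder removes
    one whole picture instantaneously at startup_delay_s and then once
    every picture_period_s.
    """
    occupancy = 0.0
    overflowed = underflowed = False
    last_time = 0.0
    for n, bits in enumerate(picture_bits):
        removal_time = startup_delay_s + n * picture_period_s
        # Fill the buffer up to the instant just before the removal,
        # which is when occupancy peaks for a constant input rate.
        occupancy += input_rate_bps * (removal_time - last_time)
        last_time = removal_time
        if occupancy > capacity_bits:
            overflowed = True
        if occupancy < bits:
            underflowed = True  # the picture is not yet fully buffered
        occupancy = max(occupancy - bits, 0.0)
    return overflowed, underflowed
```

For instance, ten pictures of 40000 bits each, delivered at
1200000 bits/s with a picture period of 1/30 s and a startup delay of
0.1 s, stay within the 46K byte buffer in this model, whereas the same
stream with no startup delay underflows on the first picture.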

It is desirable from the standpoint of the
construction of the system, and to maximize flexibility,
that, in the real decoder 600, the element corresponding
to the demultiplexer 401 in the STD include a switching
circuit, and that the elements corresponding to the video
decoder 405 and the audio decoder 406 in the STD be
provided using a high-speed digital signal processor (DSP)
having a configuration suitable for performing high-speed
signal processing operations. Such processors normally
cannot include a large amount of storage for cost reasons.
Therefore, the MPEG standards take these practical
considerations into account and set the storage capacities
of the video input buffer 402 and the audio input buffer
403 to the relatively small values set forth above.

Figure 2 shows the structure of the constraint
parameter (multiplex) system bit stream CPSP that is
notionally fed into the system target decoder 400. The
bit stream shown in Figure 2 has a multi-layer structure,
and includes various headers in a multiplex layer and the
audio stream and the video stream in a signal layer. In
this structure, plural packs are serially arranged in time.

WO 94/30014                                   PCT/JP94/00942

Each pack begins with a pack header, and includes at least
one video packet and at least one audio packet. Each
video packet begins with a packet header and includes the
video stream of at least part of at least one picture. One
video packet will accommodate the video stream of more
than one B-picture, but several video packets are
required to accommodate the video stream of one I-picture.
There is no requirement that a picture begin immediately
after the packet header: the picture may start at any
point in the video packet.

Each video packet header may include at least one
video time stamp showing the presentation time of the
first picture that begins in the packet. If the first
picture is an I-picture or a P-picture, and its decoding
time differs from its presentation time, a decoding time
stamp may also be included. The purpose and use of the
video time stamps will be described below.

Each audio packet includes at least one audio access
unit of the audio stream, and begins with an audio packet
header. The audio packet header may include a
presentation time stamp showing the output timing of the
audio signal obtained by decoding the first audio access
unit beginning in the audio packet. Each audio access
unit is about 384 bytes in MPEG-1.
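
The pack and packet layering described above can be pictured with a
small data structure. The following Python sketch is an illustration
only: the class and field names are hypothetical, it does not follow
the actual MPEG-1 bit-level syntax, and the time stamps are shown as
plain numbers.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Packet:
    kind: str                                   # "video" or "audio"
    payload: bytes                              # part of the elementary stream
    presentation_time: Optional[float] = None   # optional header time stamp
    decoding_time: Optional[float] = None       # optional, video packets only


@dataclass
class Pack:
    packets: List[Packet] = field(default_factory=list)

    def is_valid(self) -> bool:
        """A pack holds at least one video and at least one audio packet."""
        kinds = {p.kind for p in self.packets}
        return {"video", "audio"} <= kinds


def demultiplex(packs: List[Pack]) -> Dict[str, bytes]:
    """Concatenate packet payloads into one byte string per stream."""
    streams = {"video": b"", "audio": b""}
    for pack in packs:
        for packet in pack.packets:
            streams[packet.kind] += packet.payload
    return streams
```

Because a picture may start at any point in a video packet, the
payload here is treated as an opaque slice of the video stream rather
than as a whole number of pictures.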

Figure 2 shows a video packet that includes the video 
stream of [illegible] at least the beginning of the picture
[illegible] corresponds to the picture [illegible] and the
audio [illegible] of the access unit [illegible] the audio
packet header [illegible] time stamp of the audio access
unit [illegible] the first access unit [illegible].

The encoder 100 compresses and codes the video signal
and the audio signal to provide a video stream and an
audio stream, respectively, and multiplexes the audio
stream, the video stream, and the various headers to
provide the multiplexed bit stream S100 having the format
shown in Figure 2. The encoder feeds the multiplexed bit
stream to the medium 5 for transmission or storage. The
multiplexed bit stream is such that, if the encoder had
fed the multiplexed bit stream to the system target
decoder 400 for decoding, the system target decoder would
have decoded the bit stream successfully, and no overflow
or underflow would have occurred in either of the input
buffers in the system target decoder.

Because of the requirement that the multiplexed bit
stream S100 be capable of being successfully decoded by
the system target decoder 400, the encoder 100 applies a
dynamically varying compression and coding processing to
at least the video signal S2. The compression ratio of
the compression applied by the encoder 100 varies with
time. Moreover, since the amount of the video stream that
can be used to represent a picture of the video signal S2
depends on the occupancy of the video input buffer of the
system target decoder at the instant that the picture is
compressed, the amount of compression applied to a given
picture varies dynamically. The amount of the video
stream derived from a given video sequence will differ if
the given video sequence is processed on different
occasions. Accordingly, the compression ratio of at least
the video stream produced by the encoder 100 varies
constantly.

As shown above, the audio stream and the video stream
are time multiplexed to provide the multiplexed bit stream
S100. The audio stream of the audio signal belonging to a
given picture of the video signal is located in the
multiplexed bit stream some time earlier or later than the
video stream of the picture. As a result of this, the
decoder 600 must provide timing synchronization between
[illegible]. As mentioned above, [illegible] encoder add
time stamps to at least some of the video packet headers
and the audio packet headers. The video time stamps and
the audio time stamps [illegible] the clocks to [illegible]
decoding of the video stream and the audio stream
[illegible] the video stream and the audio stream at the
decoder output. Such timing information is necessary to
prevent audio/video synchronization errors from occurring
if the decoder is unable to decode lost or corrupted audio
or video access units. This will be described in more
detail below.

Figure 3 shows the structure of the decoder 600. In
the decoder 600, the demultiplexer 601 receives the
multiplexed bit stream from the medium 5. The
demultiplexer demultiplexes the multiplexed bit stream
into the video stream, the video time stamps, the audio
stream, and the audio time stamps. The video time stamps
and the audio time stamps are respectively fed to the
picture rate control circuit 698 and the sampling rate
control circuit 699 for use in decoding the video stream
and the audio stream, respectively. The video stream from
the output of the demultiplexer 601 is fed into the video
input buffer 602, which precedes the video decoder 605.
The audio stream from the demultiplexer is fed into the
audio input buffer 603, which precedes the audio decoder
606.

The video decoder 605 removes each access unit of the
video stream from the video input buffer 602 for decoding
in the order in which the access unit was received by the
video input buffer. The video decoder 605 decodes the
video stream removed from the video input buffer 602 in
response to timing signals received from the picture rate
control circuit 698. The picture rate control circuit is,
in turn, controlled by the time stamps fed from the
demultiplexer 601. Similarly, the audio decoder 606
removes each access unit of the audio stream from the
audio input buffer 603 for decoding in the order in which
the access unit was received by the audio input buffer.
The audio decoder 606 decodes the audio stream removed
from the audio input buffer 603 in response to timing
signals received from the sampling rate control circuit
699. The sampling rate controller is, in turn, controlled
by the time stamps fed from the demultiplexer 601.

The streams entering the decoders must be buffered for the
following reasons. The first reason is that, as mentioned
above, the compression ratios constantly change. The
second reason is that [illegible] average input rate of
the elementary streams to its respective decoder
[illegible] depending on clock error. The third reason is
that the decoders normally receive access units of their
respective streams intermittently, so that the
instantaneous transfer rate of the elementary stream in
the multiplexed bit stream S5 from the medium 5 and the
instantaneous input rate of the elementary stream to its
respective decoder do not match. Therefore, the input
buffers 602 and 603 are provided between the demultiplexer
[illegible], respectively, to adjust the differences in
the average [illegible] instantaneous transfer rate and
the instantaneous input
rate.

Figures 4A to 4D include three bit index curves showing
the time dependency of the transfer of the audio stream in
the multiplexed signal from the medium 5 into the audio
input buffer 603 and the input of the audio stream into
the audio decoder 606 from the audio input buffer. The
arrangement of the audio input buffer 603 and the audio
decoder 606 is shown in Figure 4A.

The bit index curves show the relationship between
the total number of bits (shown on the y-axis) that have
passed a given point in the circuit by the time indicated
on the x-axis.

Figure 4B shows the average bit index at the point IA
at the input of the audio input buffer 603, which reflects
the average rate at which the audio stream is transferred
from the medium. The curve shows that the average
transfer rate of the audio stream from the medium is more
or less constant. However, the curve is not a straight
line because the transfer rate varies with time due to
clock drift.

Figure 4C shows the actual bit index at the point IA
at the input to the audio input buffer 603. No bits are
fed into the audio input buffer at first, because the
demultiplexer is feeding the video stream into the video
input buffer. Then, the demultiplexer 601 encounters the
first audio packet in the multiplexed bit stream, and
feeds the audio access units contained therein into the
audio input buffer 603. Following the first audio packet,
the demultiplexer ceases transfer of the audio stream into
the audio input buffer
during the time it feeds the contents of the next video
packet (B) into the video input buffer. Then, the
demultiplexer 601 encounters another audio packet in the
multiplexed bit stream and feeds the audio access units
contained therein into the audio input buffer. This
process is repeated throughout the decoding process.

Figure 4D shows the bit index at the point OA at the
output of the audio input buffer 603 as the audio stream
is removed from the audio input buffer by the audio
decoder 606. The audio decoder removes the audio stream
from the audio input buffer one access unit at a time.
Removal of the access unit takes place instantaneously,
once every 24 ms, for example.
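
The stepped output curve of Figure 4D can be reproduced with a few
lines of arithmetic. The following Python sketch uses the two example
figures given in this description, an access unit of about 384 bytes
and a removal period of 24 ms; the function names and the
`first_removal_ms` parameter are assumptions made for illustration.

```python
AUDIO_ACCESS_UNIT_BITS = 384 * 8  # about 384 bytes per access unit
REMOVAL_PERIOD_MS = 24            # one access unit removed every 24 ms


def output_bit_index(time_ms: int, first_removal_ms: int = 0) -> int:
    """Total bits removed from the audio input buffer up to time_ms."""
    if time_ms < first_removal_ms:
        return 0
    # Each removal is instantaneous, so the curve is a staircase.
    removals = (time_ms - first_removal_ms) // REMOVAL_PERIOD_MS + 1
    return removals * AUDIO_ACCESS_UNIT_BITS


def average_output_rate_bps() -> float:
    """Long-run rate implied by the step size and the step period."""
    return AUDIO_ACCESS_UNIT_BITS * 1000 / REMOVAL_PERIOD_MS
```

With these example figures the staircase rises by 3072 bits every
24 ms, which corresponds to an average of 128000 bits/s.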

When each picture of the video signal is compressed
and subject to variable length coding in the encoder 100,
the amount of video stream produced changes significantly
from picture to picture, depending on the mode in which
the video signal of the picture was compressed, as
described above. Accordingly, the input rate at which the
video decoder 605 removes the video stream from the video
input buffer 602 also changes significantly from picture
to picture. As a result, the storage capacity of the
video input buffer 602 is required to be considerably
larger than the storage capacity of the audio input buffer
603. For example, the MPEG-1 standard requires that the
size, i.e., the storage capacity, of the video input
buffer 602 be 46K bytes, whereas the standard sets the
size of the audio input buffer at only 4K bytes.

Figures 5A to 5D include three bit index curves
showing the time dependency of the transfer of the video
stream in the multiplexed signal from the medium 5 into
the video input buffer 602 and the input of the video
stream into the video decoder 605 from the video input
buffer. The arrangement of the video input buffer 602 and
the video decoder 605 is shown in Figure 5A.

Figure 5B shows the average bit index at the point IV
at the input of the video input buffer 602, which reflects
the average rate at which the video stream is transferred
from the medium. The curve shows that the average
transfer rate of the video stream from the medium is more
or less constant. However, the curve is not a straight
line because the transfer rate varies gradually with time
due to clock drift.

Figure 5C shows the actual bit index at the point IV
at the input to the video input buffer 602. The video
stream is first fed into the video input buffer at a
substantially constant rate until the demultiplexer 601
encounters the first audio packet in the multiplexed bit
stream. The demultiplexer interrupts feeding the video
stream into the video input buffer while it feeds the
contents of the audio packet into the audio input buffer
603. During this interruption, the bit index remains
unchanged. At the end of the first audio packet, the
demultiplexer 601 demultiplexes the video packet header of
the following video packet, and then resumes transferring
the video stream into the video input buffer until it
encounters the next audio packet. This process is
repeated throughout the decoding process.

Figure 5D shows the bit index at the point OV at the
output of the video input buffer 602 as the video stream
is removed from the video input buffer by the video
decoder 605. The video decoder removes the video stream
from the video input buffer one access unit, i.e., one
picture, at a time. Removal of the access unit takes
place instantaneously, once every picture period, e.g.,
once every 33.4 ms in an NTSC system. The amount of the
video stream removed each time depends on the mode in
which the picture was compressed by the encoder. Figure
5D shows an example in which a sequence of B-pictures is
followed by an I-picture, which is followed by a sequence
of B-pictures. It can be seen that a much greater amount
of the video stream is removed for one I-picture than for
one B-picture.

Figures 6A and 6B show the buffering provided by the
video input buffer 602 or the audio input buffer 603. In
these Figures, the video input buffer 602 is used as an
example. The figures are both bit index curves. Figure
6A shows ideal buffering, in which the video input buffer
602 is used simply to accommodate the differences between
the transfer rate of the video stream from the medium and
the input rate of the video stream to the video decoder
605. The video stream is fed into the video input buffer
602 from the demultiplexer 601 at a substantially constant
transfer rate, as indicated by the straight line marked IS
in Figure 6A. The video decoder removes the video stream
from the video input buffer one access unit, i.e., one
picture, at a time, as shown. The amount of video stream
removed for any one picture can vary from about 150 Kbits
for an I-picture to about 5 Kbits for a B-picture. Thus,
the video stream bit index at the output of the video
input buffer changes in steps, the step size of which
depends on the number of bits used to encode each picture,
as indicated by the stepped curve marked OS.

In the ideal buffering illustrated in Figure 6A, both
of the following conditions are met at all times:

(a) the difference between the amount of the video
stream transferred into the video input buffer 602 from
the medium and the storage capacity of the video input
buffer 602 (indicated by the broken line SC), does not
exceed the amount of the video stream removed from the
video input buffer by the video decoder, i.e., there is no
overflow; and

(b) the amount of the video stream removed from the
video input buffer 602 by the video decoder 605 does not
exceed the amount of the video stream transferred into the
video input buffer from the medium, i.e., there is no
underflow.
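
The two conditions can be expressed directly in terms of the bit index
curves. The following Python sketch checks them at a series of sample
times; it is a minimal illustration in which the function name and the
sampled-list representation of the curves are assumptions, not part of
the described system.

```python
from typing import Dict, List, Sequence


def check_buffering(
    bits_in: Sequence[int],
    bits_out: Sequence[int],
    capacity_bits: int,
) -> Dict[str, List[int]]:
    """Check the no-overflow and no-underflow conditions at each sample.

    bits_in[i]  : cumulative bits transferred into the input buffer
    bits_out[i] : cumulative bits removed by the decoder
    """
    pairs = list(zip(bits_in, bits_out))
    # Condition (a): (transferred - capacity) must not exceed removed.
    overflow_at = [i for i, (a, b) in enumerate(pairs)
                   if a - capacity_bits > b]
    # Condition (b): removed must not exceed transferred.
    underflow_at = [i for i, (a, b) in enumerate(pairs) if b > a]
    return {"overflow_at": overflow_at, "underflow_at": underflow_at}
```

Both conditions amount to keeping the buffer occupancy, the
transferred amount minus the removed amount, between zero and the
storage capacity at every instant.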

However, as illustrated in Figure 6B, an overflow or an
underflow can sometimes occur in buffering. In Figure 6B,
the transfer rate at which the video stream is received
[illegible]
otherwise similar to that shown in Figure 6A. Initially,
the video input buffer 602 receives an excess amount of
video stream compared with that required by the video
decoder, with the result that the video input buffer
overflows at the point indicated by the letter A. Later,
the transfer rate of the video stream received by the
video input buffer falls below the demand of the video
decoder for the video stream, with the result that the
video input buffer underflows at the point indicated by the
letter B.

By controlling various ones of the parameters
involved, input buffer overflow or underflow can be
prevented. Some ways of preventing overflow or underflow
are illustrated in the bit index curves shown in Figures
7A through 7C.

The first method illustrated in Figure 7A is called
the medium slave method. In this method, the amount of
the video stream transferred from the medium 5 to the
video input buffer 602 is controlled to prevent an
overflow or underflow from occurring. Without such
control, the transfer rate is indicated by the curve L1.
With control, the transfer rate is that indicated by the
curve L1'. The amount of the video stream transferred
from the medium is controlled so that the following two
conditions are satisfied:

(a) the difference between the amount of the video
stream (indicated by curve L1') transferred into the video
input buffer 602 from the medium and the storage capacity
of the video input buffer does not exceed the amount of
the video stream (indicated by the curve L3) removed from
the video input buffer by the video decoder 605, i.e.,
there is no overflow; and

(b) the amount of the video stream (indicated by the
curve L3) removed by the video decoder 605 from the video
input buffer 602 does not exceed the amount of the video
stream (indicated by the curve L1') transferred into the
video input buffer 602, i.e., there is no underflow.

The curve L2 shows how controlling the amount of the
video stream transferred into the video input buffer 602
from the medium controls the difference between the amount
of the video stream transferred into the video input
buffer and the storage capacity of the video input buffer.
Curve L2' shows this difference when the amount of the
video stream transferred into the video input buffer from
the medium is not controlled.
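
One way to picture the medium slave method is as a transfer loop that
never sends more than the input buffer can hold. The following Python
sketch is only an illustration under stated assumptions: time is
divided into discrete ticks, the medium can deliver at most
`max_bits_per_tick`, and the function name and parameters are
hypothetical.

```python
from typing import List


def medium_slave_transfer(
    removals: List[int],
    max_bits_per_tick: int,
    capacity_bits: int,
) -> List[int]:
    """Choose how many bits to transfer from the medium on each tick.

    removals[i] is the number of bits the decoder removes at the end
    of tick i. Each transfer is limited to the free space in the
    buffer, so the buffer cannot overflow.
    """
    occupancy = 0
    transfers = []
    for removed in removals:
        free_space = capacity_bits - occupancy
        sent = min(max_bits_per_tick, free_space)  # throttle the medium
        occupancy += sent
        if removed > occupancy:
            raise ValueError("underflow: medium cannot supply bits in time")
        occupancy -= removed
        transfers.append(sent)
    return transfers
```

Throttling in this way guards against overflow; avoiding underflow
still depends on the medium being able to deliver the stream quickly
enough, which is why the sketch reports an underflow instead of
correcting it.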

The second method illustrated in Figure 7B is called
the decoder slave method. In this method, the picture
rate of the video decoder is controlled to change the
amount of the video stream removed from the video input
buffer by the video decoder. The picture rate is
controlled such that the following two conditions are both
satisfied:

(a) the amount of video stream (indicated by the curve
L2), which is the difference between the amount of the
video stream (indicated by the curve L1) fed into the
video input buffer 602 and the storage capacity of the
video input buffer, does not exceed the amount of the
video stream (indicated by the curve L3') removed from the
video input buffer by the video decoder, i.e., there is
no overflow; and

(b) the amount of the video stream (indicated by the
curve L3') removed from the video input buffer by the
video decoder does not exceed the amount of the video
stream (indicated by the curve L1) transferred into the
video input buffer 602 from the medium, i.e., there is no
underflow.

The actual amounts of the video stream removed from
the video input buffer by the video decoder are indicated
by the curve L3'.

The above explanation is made with reference to the
video stream, but similar results can be obtained for the
audio stream by changing the sampling rate of the audio
decoder 606 to adjust the rate at which the audio stream
is removed from the audio input buffer 603.
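
The decoder slave method can be sketched as a small adjustment of the
picture period driven by buffer occupancy. In the following Python
sketch the proportional rule, the half-full target, and the
`max_adjust` limit are all assumptions made for illustration; the
description does not specify how the picture rate is chosen.

```python
def adjusted_picture_period(
    nominal_period_s: float,
    occupancy_bits: float,
    capacity_bits: float,
    max_adjust: float = 0.01,
) -> float:
    """Nudge the picture period according to how full the buffer is.

    A buffer above half full shortens the period, so pictures are
    removed faster; a buffer below half full lengthens it. The change
    is limited to +/- max_adjust as a fraction of the nominal period.
    """
    fullness = occupancy_bits / capacity_bits         # 0.0 empty, 1.0 full
    correction = (0.5 - fullness) * 2.0 * max_adjust  # within +/- max_adjust
    return nominal_period_s * (1.0 + correction)
```

The same rule could be applied to the sampling period of the audio
decoder. Keeping `max_adjust` small reflects the fact that the output
rate can only be varied within a narrow range before the change
becomes noticeable downstream.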

The third method illustrated in Figure 7C adjusts the
amount of the video stream removed from the video input
buffer 602 by the video decoder 605. For example, the
method may cause the video decoder to skip decoding
portions of the video stream or to repeat decoding
portions of the video stream to adjust the amount of the
video stream removed from the video input buffer.

The curve L3' shows the changes in the amount of the
video stream removed from the video input buffer 602. To
prevent an overflow from occurring early in the sequence,
the amount of the video stream removed from the video
input buffer is increased by removing some video access
units from the video input buffer but not decoding them.
Later, to prevent an underflow, the amount of the video
stream removed from the input buffer is reduced by
leaving video access units in the video input
buffer and decoding them more than once. This provides
additional pictures without removing video access units
from the video input buffer.

Changing the picture rate of the video decoder, the
sampling rate of the audio decoder, or the transfer rate
of the multiplexed bit stream from the medium 5, as just
described, causes undesirable side effects on the systems
external to the video and audio signal processing system.
Therefore, the changes just described cannot be made
freely, and may only be made within a limited range.
Consequently, it is desirable to control the multiplexed
bit stream produced by the encoder so that the buffering
requirements in the decoder can be met comfortably without
having to resort to the correction methods just described.

Malfunctions in the buffering process are most likely
to occur at the start of decoding. An underflow will
result if the decoder attempts to remove an access unit
from an input buffer before the whole of that access unit
has been transferred into the input buffer from the
medium. To prevent this, the decoding processing is
started only after a certain delay time has elapsed after
transfer of the bit stream from the medium has begun.
This allows the audio stream and the video stream to
accumulate in the respective audio and video input buffers
before the respective decoders start removing units of the 
audio stream and the video stream for decoding. 

Figures 8A through 8D show some effects of a startup
delay on buffering. Figure 8A shows ideal buffering,
similar to that shown in Figure 6A. Figure 8B shows the
beneficial effect of a suitable startup delay when the
multiplexed bit stream is transferred from the medium at a
varying transfer rate. In Figure 8B, the startup delay
allows additional video stream to accumulate in the video
input buffer 602 before the video decoder 605 starts to
remove access units of the video stream from the video
input buffer.

Care must be exercised in determining the optimum
startup delay. Figure 8C shows the effect of an
excessively long startup delay. In Figure 8C, the video
decoder 605 waits too long before it starts to remove the
video stream from the video input buffer 602. As a
result, an overflow occurs at point C. Figure 8D shows
the effect of a startup delay that is too short. The short
startup delay does not allow sufficient video stream to
accumulate in the video input buffer before the video
decoder starts to remove the video stream from the video
input buffer for decoding. As a result, insufficient
video stream has accumulated in the video input buffer
when the video decoder tries to remove the video stream of
the first I-picture I2, and an underflow occurs at point
D. Figure 8D also shows that, with a suitable startup
delay, the video stream of the first I-picture I2 can be
removed without causing an underflow.
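
The effect of the startup delay can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not in the text: the transfer rate is constant, one picture is removed instantaneously every picture period, and the function name and all numeric values are hypothetical.

```python
def simulate_startup(delay, rate, picture_bits, period, capacity):
    """Classify a startup delay as 'ok', 'overflow' or 'underflow'.

    delay        -- time between start of transfer and first removal
    rate         -- constant transfer rate into the buffer (bits per time unit)
    picture_bits -- sizes of the access units, in decoding order
    period       -- picture period (same time unit as delay)
    capacity     -- size of the input buffer, in bits
    """
    total = sum(picture_bits)
    removed = 0
    for k, size in enumerate(picture_bits):
        t = delay + k * period
        arrived = min(rate * t, total)    # transfer stops at end of stream
        occupancy = arrived - removed     # peak occurs just before a removal
        if occupancy > capacity:
            return "overflow"
        if occupancy < size:              # access unit not fully received
            return "underflow"
        removed += size
    return "ok"


# Hypothetical stream: a large first picture followed by two small ones.
for d in (1, 2, 3):
    print(d, simulate_startup(d, rate=1000, picture_bits=[1500, 500, 500],
                              period=1, capacity=2000))
```

With these made-up numbers the shortest delay underflows on the large first picture, the longest delay overflows the buffer before decoding starts, and the intermediate delay satisfies both constraints.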

Figure 9 illustrates in detail how the multiplexed bit
stream transferred from the medium 5 is processed by the
demultiplexer 601, the video input buffer 602, and the
video decoder 605 to decode the video stream in the
multiplexed bit stream. The circuit arrangement of the
demultiplexer 601, the video input buffer 602, and the
video decoder 605 is shown at the top of the drawing.

An example of a portion of the multiplexed bit stream
is shown at the left side of the drawing. The portion of
the multiplexed bit stream includes all of the pack n
and the beginning part of the pack n+1. Each pack begins
with the pack header, which includes the clock reference
SCR, which shows the decoding timing of the pack.

The pack n begins with the pack header (Pack Header
n), and contains the video packet m, which, in turn,
contains the video stream for the pictures i and i+1. The
video packet m begins with the video packet header
(V.Packet H), which includes the presentation time stamp
PTSm and the decoding time stamp DTSm.

The pack n+1 follows the pack n, and includes the
pack header (Pack Head n+1), which includes the clock
reference SCRn+1. Following the pack header are the video
packets m+1 and m+2, and possibly more video packets.
Each of the video packets m+1 and m+2 includes a packet
header including a decoding time stamp DTS, and the video
stream of one picture.

Figure 9 also shows the bit index curves for the input
(marked IV) and the output (marked OV) of the video input
buffer 602. Various events in the multiplexed bit stream
are linked to the bit index curves with broken lines, and
are also shown on the x-axis of the bit index curve. The
bit index curve IV represents the bit index of the video
stream transferred to the video input buffer 602 from the
medium 5 via the demultiplexer 601. The bit index curve
OV represents the bit index of the video stream removed
from the video input buffer by the video decoder 605.

The multiplexed bit stream is processed as follows:
at the timing indicated by the clock reference SCRn in the
pack header of the pack n, the video stream contained in
the pack n, i.e., the video stream of the pictures i and
i+1, is transferred via the demultiplexer 601 to the video
input buffer 602. Then, at the timing indicated by the
clock reference SCRn+1, the video stream contained in the
pack n+1 is transferred into the video input buffer 602
via the demultiplexer 601. The time stamps in the video
packet headers are stored elsewhere.

Later, at the time indicated by the decoding time
stamp DTSm in the header of the video packet m, the video
stream of the picture i is instantaneously removed from
the video input buffer 602 by the video decoder 605.
Then, one picture period later, the video stream of the
picture i+1, which was also included in the video packet
m, is removed from the video input buffer by the video
decoder. Later, at the timing indicated by the decoding
time stamp DTSm+1 included in the packet header of the
video packet m+1, the video stream of the picture i+2,
which is the first picture beginning in the video packet
m+1, is removed from the video input buffer 602 by the
video decoder 605.

At the time indicated by the decoding time stamp
DTSm+2 in the packet header of the video packet m+2, the
video stream of the picture i+3, which is the first
picture beginning in the video packet m+2, is removed from
the video input buffer 602 by the video decoder 605.
Following removal of the video stream of the picture i+3,
the video streams of the pictures whose video streams
follow the video stream of the picture i+3 in the video
packet m+2 are removed from the video input buffer 602 at
times that are increments of one picture period later than
the time indicated by the decoding time stamp DTSm+2.

The timings indicated by the time stamps may be stored
as absolute timings using, for example, a crystal
oscillator and a reference clock of 90 kHz. In this way it
is possible to use the difference between the clock
reference and the time stamps as the startup delay.
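
As a worked illustration of the last point, the difference between a clock reference and a decoding time stamp, both counted in ticks of the 90 kHz reference clock, converts directly into a startup delay in seconds. This is a minimal sketch; the function name and the tick values are hypothetical, and counter wrap-around is deliberately ignored.

```python
CLOCK_HZ = 90_000  # reference clock frequency stated in the text


def startup_delay_seconds(scr_ticks, dts_ticks):
    """Delay between a clock reference and a decoding time stamp.

    Both arguments are counts of the 90 kHz reference clock.
    Wrap-around of the counters is not handled in this sketch.
    """
    return (dts_ticks - scr_ticks) / CLOCK_HZ


# Hypothetical values: the first picture is decoded 45 000 ticks
# after the pack carrying it starts to arrive.
print(startup_delay_seconds(scr_ticks=0, dts_ticks=45_000))
```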

As mentioned above, when a decoder according to the
MPEG standard is used for decoding an audio stream and a
video stream, it is necessary to synchronize the times at
which units of the respective decoded signals resulting
from decoding corresponding access units of the audio
stream and the video stream are fed to the decoder output.
The time at which a decoded signal unit is fed to the
decoder output is called the presentation time of that
unit. The time stamps in the multiplexed bit stream are
used to provide this synchronization.

Part of providing the necessary synchronization
includes reordering the video signal resulting from
decoding the video stream. This is illustrated in Figure
10. As mentioned above, the video stream includes the
video streams of pictures that are compressed as
I-pictures, as P-pictures, and as B-pictures. Of these
pictures, the decoding time and the presentation time are
only the same for B-pictures. Incidentally, the decoding
time and the presentation time are also the same for the
audio stream. I-pictures and P-pictures have a
presentation time that is later by a number of picture
periods than the decoding time. The video stream of each
picture is removed from the video input buffer 602 at the
time indicated by the decoding time stamp. After the
video stream of a picture has been decoded, the resulting
decoded video signal is temporarily stored in the video
decoder output buffer 611. Then, at the time indicated by
a presentation time stamp PTS, the video signal of the
picture is fed from the video decoder output buffer to the
output of the video decoder as the decoded video signal.

For example, in Figure 10, the video stream of the
I-picture I2 is removed from the video input buffer 602 at
the time indicated by the decoding time stamp DTSm for
decoding, and the resulting video signal is stored in the
output buffer 611 provided in the video decoder 605 for
temporarily storing the video signals of decoded
I-pictures and P-pictures.

Then, the video decoder 605 consecutively removes the
video streams of the B-pictures B0 and B1 from the video
input buffer 602, consecutively decodes them, and feeds
the resulting video signals to its output one picture
period apart.

Next, the video decoder 605 removes the video stream
of the P-picture P5 from the video input buffer 602. The
video decoder instantaneously decodes the video stream,
and stores the resulting video signal in the output buffer
611. Also, at the time indicated by the presentation time
stamp PTS of the I-picture I2, which has the same value as
the decoding time stamp of the P-picture P5, the video
decoder feeds the video signal of the picture I2 to its
output.

Finally, in this example, the video decoder 605
consecutively removes the video streams of the B-pictures
B3 and B4 from the video input buffer 602, consecutively
decodes them using the stored pictures I2 and P5 as
reference pictures, and feeds the resulting video signals
to its output one picture period apart.

Since the video streams of I-pictures and P-pictures
differ in their decoding timing and their presentation
timing, a presentation time stamp and a decoding time
stamp, respectively indicating the presentation time and
the decoding time, are included in the video packet
headers of the video packets in which the video streams of
I-pictures or P-pictures begin. However, both types of
time stamps need not be included, because, according to
the MPEG decoding rules, the presentation time of each
I-picture or P-picture is the same as the decoding time of
the following I-picture or P-picture. In other words, the
decoding time stamps can be omitted, and each I-picture or
P-picture can be decoded at the time indicated by the
presentation time stamp of the previous I-picture or
P-picture.

As the decoding and presentation times of the video
stream in Figure 10 show, the video decoder removes the
video streams of the pictures from the video input buffer
in the order in which they were received, i.e., in
non-sequential picture order. However, the presentation
time stamps of the pictures cause the pictures to be
displayed in their sequential order shown at the bottom
of the figure.
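
The relationship between decoding order and presentation order can be sketched in a few lines. The sketch assumes the rule described above (a B-picture is presented when it is decoded; an I-picture or P-picture is presented at the decoding time of the following I-picture or P-picture). The picture names follow the Figure 10 example, while the function name and the time values are hypothetical.

```python
def presentation_times(pictures):
    """Derive presentation times from decoding times.

    pictures -- list of (name, kind, dts) tuples in decoding order,
                where kind is 'I', 'P' or 'B'.
    Returns a dict mapping each name to its presentation time, or to None
    when the following I- or P-picture has not been seen yet.
    """
    pts = {}
    pending = None  # most recent I/P picture still waiting to be presented
    for name, kind, dts in pictures:
        if kind == "B":
            pts[name] = dts               # decoded and presented together
        else:
            if pending is not None:
                pts[pending] = dts        # presented when the next I/P is decoded
            pending = name
            pts[name] = None
    return pts


# Decoding order of the example, with hypothetical times in picture periods.
decode_order = [("I2", "I", 0), ("B0", "B", 1), ("B1", "B", 2),
                ("P5", "P", 3), ("B3", "B", 4), ("B4", "B", 5)]
times = presentation_times(decode_order)
shown = sorted((t, n) for n, t in times.items() if t is not None)
print([n for _, n in shown])
```

Sorting by the derived presentation times recovers the sequential display order for every picture whose successor is known; the last P-picture stays pending until a further I-picture or P-picture is decoded.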

As stated above, the time stamps are included in the
multiplex layer of the multiplexed bit stream, and not in
the audio or video stream layer. This means that when the
multiplexed bit stream is demultiplexed in the decoder, the
correlation between the time stamps and the access units
to which they pertain is lost. The decoder must therefore
include a provision to link the time stamps extracted from
the multiplexed bit stream with their respective access
units. One approach is shown in Figures 11A and 11B.

In Figure 11A, the decoder 600 includes the
demultiplexer 601, which receives the multiplexed bit
stream from the medium 5. The demultiplexer demultiplexes
the video stream and the video time stamps from the
multiplexed bit stream and feeds these into the video
stream reconfiguration unit 692. The demultiplexer also
demultiplexes the audio stream and the audio time stamps
from the multiplexed bit stream and feeds these into the
audio stream reconfiguration unit 693. The output of the
video stream reconfiguration unit is fed into the video
input buffer 602, which precedes the video decoder 605.
The decoding in the video decoder is controlled by the
picture rate control circuit 698 in response to the video
time stamps. The output of the audio stream
reconfiguration unit 693 is fed into the audio input
buffer 603, which precedes the audio decoder 606.
Decoding in the audio decoder is controlled by the
sampling rate control circuit 699 in response to the audio
time stamps.

The demultiplexer 601 receives the multiplexed bit
stream S5 from the medium 5 and separates it into the
video stream, the video time stamps, the audio stream, and
the audio time stamps. The video stream and the video
time stamps are fed into the video stream reconfiguration
unit 692, which inserts the video time stamps into the
video stream. For example, a video time stamp is inserted
between the picture i and the picture i+1 shown in Figure
11B. The video stream, reconfigured as shown in Figure
11B, is fed to the video input buffer 602, where it is
temporarily stored. The video decoder 605 removes the
video stream, including the video time stamps, from the
video input buffer 602 in the order in which it was
received by the video input buffer.

In a similar manner, the audio stream reconfiguration
unit 693 receives the audio stream and the audio time
stamps from the demultiplexer 601 and inserts the audio
time stamps into the audio stream. For example, an audio
time stamp is inserted between the access unit j and the
access unit j+1 of the audio stream shown in Figure 11B.
The audio stream, reconfigured as shown in Figure 11B, is
then fed from the audio stream reconfiguration unit to the
audio input buffer 603, where it is temporarily stored.
The audio decoder 606 removes the audio stream, including
the audio time stamps, from the audio input buffer in the
order in which it was received by the audio input buffer.

The video decoder 605 decodes the video stream removed
from the video input buffer 602 in response to timing
signals received from the picture rate control circuit
698. The picture rate control circuit is, in turn,
controlled by the video time stamps fed from the video
decoder. Similarly, the audio decoder 606 decodes the
audio stream removed from the audio input buffer 603 in
response to timing signals received from the sampling rate
control circuit 699. The sampling rate control circuit is,
in turn, controlled by the audio time stamps fed from the
audio decoder.

The decoder just described solves the problem of 
correlating the time stamps included in the multiplex 
layer with the video and audio access units to which they 
belong. However, embedding the time stamps into the audio 
and video streams results in streams that are no longer 
standard. A decoder that is suitable for decoding, for 
example, a video stream with embedded time stamps would be 
unsuitable for decoding a video stream in an application 
in which time stamps are not used. It is therefore
preferable to correlate the time stamps with the access 
units to which they belong in a way that does not result in 
a non-standard stream and a non-standard decoder. 

Recently, the MPEG standards have permitted packets of 
information other than an audio stream or a video stream 
to be included in the multiplexed bit stream. For 
example, packets of directory information may be added to 
the bit stream. Directory information allows pictures to 
be displayed during fast forward operations by providing 
the address of successive access points in the multiplexed 
bit stream. An access point is an access unit that can be
decoded without requiring that another access unit be
decoded. For example, a video access point is a picture
that is wholly or partially coded using intra-picture
coding. An access point is normally located at the
beginning of each Group of Pictures.

The MPEG standards stipulate that the packets
containing directory information (directory packets) be
interleaved with the audio packets and the video packets
in the multiplexed bit stream, and also stipulate that a
directory information buffer be provided in the decoder.
However, the MPEG standards define neither the size nor
the operation of the directory buffer. Because of the
memory constraints in processors used in MPEG decoders,
decoder designers allocate relatively little memory for
buffering the directory information. Moreover, encoder
designers have customarily made the directory packets
relatively large, so that the directory packets occur
relatively rarely in the multiplexed bit stream.

The impact of the present relationship between the
directory buffer size and the size and spacing of the
directory packets on the fast-forward operation of a video
tape recorder is shown in Figures 12A to 12E. Figure 12A
shows the arrangement of part of the multiplexed bit
stream as recorded on the video tape. The directory
packet consists of the directory packet header
(Dir.Pkt.Hdr), followed by a set of directory entries, one
directory entry for each one of the following Groups of
Pictures. Following the directory packet are plural video
packets containing the video stream of the Groups of
Pictures. Since, in this example, there are 20 Groups of
Pictures following the directory packet, the directory
packet includes 20 directory entries. In these figures,
the audio packets interleaved with the video packets have
been omitted to simplify the drawing.

During the fast-forward operation, the directory
packet header is recognized, and the contents of the
directory packet are read from the tape and transferred
into the directory buffer, as shown in Figure 12B.
However, since the directory buffer typically has a
capacity of about 500 bits, and each directory entry
typically requires about 100 bits, the directory buffer
overflows after the first five directory entries have been
stored.

After the contents of the directory packet have been
reproduced from the tape, the address of the beginning of
the first Group of Pictures (GOP 0) is read from the
directory buffer, and the tape is advanced to this address
to enable the access point at the beginning of the first
Group of Pictures to be reproduced from the tape, as shown
in Figure 12C. While this picture is being decoded for
display, the address of the beginning of the second Group
of Pictures (GOP 1) is read from the directory buffer, and
the tape is advanced to this address to enable the access
point, e.g., I-picture, at the beginning of the second
Group of Pictures to be reproduced from the tape, also as
shown in Figure 12C. This process is repeated, as shown
in Figure 12C, up to the fifth Group of Pictures (GOP 4),
after which the contents of the directory buffer are
exhausted.

Then, the tape has to be rewound back to the directory
packet to reproduce the next five of the directory
entries. These directory entries are stored in the
directory buffer, as shown in Figure 12D. The tape
recorder then uses these five new directory entries to
fast forward through the pictures at the beginnings of the
sixth through tenth Groups of Pictures (GOPs 5-9), as
shown in Figure 12E. In all, the directory packet must be
reproduced four times for the pictures at the beginning of
each of the twenty Groups of Pictures GOP 0-GOP 19 to be
reproduced.
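
The count of four passes follows from the figures quoted in the example (20 directory entries, a buffer of about 500 bits, about 100 bits per entry). The arithmetic can be sketched as follows; the function name is hypothetical.

```python
def directory_passes(entries, buffer_bits, entry_bits):
    """Number of times a directory packet must be read from the tape.

    entries     -- number of directory entries in the packet
    buffer_bits -- capacity of the directory buffer, in bits
    entry_bits  -- size of one directory entry, in bits
    """
    per_pass = buffer_bits // entry_bits   # entries that fit in the buffer
    if per_pass == 0:
        raise ValueError("buffer cannot hold a single directory entry")
    return -(-entries // per_pass)         # ceiling division


# Values from the example: 20 entries, 500-bit buffer, 100 bits per entry.
print(directory_passes(20, 500, 100))
```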

The mismatch between the directory buffer capacity
and the size and spacing of the directory packets makes
the fast-forward operation an extremely slow one if
pictures are to be reproduced during the fast-forward
operation, something that is routine during the
fast-forward operation in an analog video tape recorder.

Using a larger directory buffer is not a complete
solution to the problem just described (although a larger
buffer may reduce the seriousness of the problem) because
the MPEG standards do not define the size and operation of
the directory packet. Hence, no matter how large the
directory buffer is made, the possibility of a directory
packet larger than the directory buffer always exists.

As an alternative to embedding time stamps in the
audio and video streams following demultiplexing, it has
been proposed to provide time stamp buffers to store the
time stamps until they are needed. Separate buffers may
be provided for the time stamps relating to audio access
units and for the time stamps relating to video access
units. Again, the MPEG standards include no direct
specification for the size and operation of these buffers.
However, the current MPEG standards require that the
system target decoder have a maximum buffering delay of
one second for both audio and video. This means that the
time stamps need only be buffered for a maximum of one
second, which enables the maximum size of the time stamp
buffers to be calculated. If a time stamp is provided for
each picture in the video stream, a buffer capacity of 30
time stamps must be provided for the video time stamps.
Similarly, if a time stamp is provided for each audio
access unit, a buffer capacity of 115 time stamps must be
provided for the audio time stamps.

In the manner just described, the MPEG standards
indirectly impose a maximum size on the audio and video
time stamp buffers. However, this way of setting the
maximum size of the time stamp buffers has an undesirable
side effect, namely, it makes the MPEG standards
unsuitable for use in applications in which a longer
buffer delay is necessary. For example, the low
picture-rate, low bit-rate video signal shown in Figure
13, although otherwise capable of being multiplexed
according to an MPEG-standard bit rate, cannot be
multiplexed by the MPEG standard because it requires a
decoder buffer delay of about 5 seconds.
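
The way a maximum buffering delay fixes the time stamp buffer capacity can be sketched as a simple calculation. The rates used below are assumptions chosen only because they reproduce capacities of 30 and 115 time stamps for a one-second delay: 30 pictures per second for video, and 44.1 kHz audio with 384 samples per access unit. Neither rate is stated in the text, and the function name is hypothetical.

```python
import math
from fractions import Fraction


def time_stamp_capacity(max_delay_s, units_per_second):
    """Time stamps that can accumulate during the maximum buffering delay.

    One time stamp per access unit is assumed.
    """
    return math.ceil(Fraction(max_delay_s) * Fraction(units_per_second))


# Assumed rates (not given in the text).
VIDEO_UNITS_PER_S = 30                      # pictures per second
AUDIO_UNITS_PER_S = Fraction(44_100, 384)   # audio access units per second

print(time_stamp_capacity(1, VIDEO_UNITS_PER_S))
print(time_stamp_capacity(1, AUDIO_UNITS_PER_S))
```

Because the capacity scales with the delay, an application that needs a buffer delay of several seconds would need a proportionally larger time stamp buffer at the same access unit rate.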

Since the MPEG standards are meant to be used in many
applications, it is desirable to eliminate the maximum
delay requirement defined by the MPEG standard and to
establish instead a more rational way of defining the time
stamp buffer sizes.

Disclosure of Invention 

The present invention provides a method of generating
a bit stream by multiplexing non-compressed auxiliary
information with an information stream. The information
stream is obtained by compressing fixed-size units of an
information signal with a varying compression ratio to
provide varying-sized units of the information stream.
The auxiliary information is for use in subsequently
decoding the information stream. Units of the auxiliary
information correspond to the units of the information
signal. In the method, the information stream is divided
in time into information stream portions. The
non-compressed auxiliary information is also divided in
time into auxiliary information portions. The information
stream portions and the auxiliary information portions are
interleaved to provide the bit stream. Finally, the
information stream dividing, auxiliary information
dividing, and interleaving steps are controlled by
emulating decoding of the bit stream by a hypothetical
system target decoder. The hypothetical system target
decoder includes a demultiplexer that demultiplexes the
bit stream, a serial arrangement of an information stream
buffer and an information stream decoder, and a serial
arrangement of an auxiliary information buffer and an
auxiliary information processor. Each serial arrangement
is connected to the demultiplexer. The information stream
dividing, auxiliary information dividing, and
interleaving steps are controlled such that the
information stream buffer and the auxiliary information
buffer neither overflow nor underflow.
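
One way to picture the emulation is as a check that a candidate interleaving keeps both buffers of the hypothetical decoder within bounds. The following is a minimal sketch under assumptions that are not in the text: portions arrive instantaneously at given times, units are removed instantaneously, and the function name, stream labels, and all numeric values are hypothetical.

```python
def emulate_std(portions, removals, capacity):
    """Emulate the two buffers of a hypothetical target decoder.

    portions -- list of (time, stream, bits): interleaved portions as they
                leave the demultiplexer
    removals -- list of (time, stream, bits): units removed by the decoder
                or the auxiliary information processor
    capacity -- dict mapping each stream label to its buffer size in bits
    Returns None if neither buffer overflows or underflows, otherwise a
    (time, stream, 'overflow' | 'underflow') tuple for the first violation.
    """
    # Arrivals (flag 0) are processed before removals (flag 1) at equal times.
    events = sorted([(t, 0, s, b) for t, s, b in portions] +
                    [(t, 1, s, -b) for t, s, b in removals])
    level = {s: 0 for s in capacity}
    for t, _, s, delta in events:
        level[s] += delta
        if level[s] > capacity[s]:
            return (t, s, "overflow")
        if level[s] < 0:
            return (t, s, "underflow")
    return None


capacity = {"info": 1000, "aux": 64}
removals = [(2, "info", 800), (2, "aux", 32)]
good = [(0, "aux", 32), (0, "info", 600), (1, "info", 400)]
late_aux = [(0, "info", 800), (3, "aux", 32)]
print(emulate_std(good, removals, capacity))
print(emulate_std(late_aux, removals, capacity))
```

A multiplexer controller built on such a check would adjust how the streams are divided and interleaved until the check reports no violation; the search strategy itself is left open here.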

The demultiplexer receives the bit stream and extracts
from the bit stream the information stream and the
auxiliary information for feeding to the information
stream buffer and the auxiliary information buffer,
respectively. The information stream buffer and the
auxiliary information buffer respectively have a first
target size and a second target size. The information
stream decoder removes the varying-sized units of the
information stream from the information stream buffer at a
first target timing, and the auxiliary information
processor removes the corresponding fixed-sized units of
the auxiliary information from the auxiliary information
buffer at a second target timing.

According to the method, when the bit stream is a
multi-layered bit stream, the interleaving step may
interleave the information stream portions and the
auxiliary information portions in the same one of the
layers of the bit stream, or may interleave the
information stream portions and the auxiliary information
portions in different layers of the bit stream.

The auxiliary information may be directory
information for the information stream, in which case, the
information stream may include plural access points, and
each unit of the directory information would relate to one
of the access points. The information stream may comprise
plural access units, and the auxiliary information may be
a set of time stamps for decoding the access units of the
information stream.

The present invention also provides an encoder for
generating a bit stream. The encoder includes a
compressor that compresses fixed-sized units of an
information signal with a varying compression ratio to
provide varying-sized units of an information stream. An
information stream divider divides the information
stream in time into information stream portions. An
auxiliary information divider divides non-compressed
auxiliary information in time into auxiliary information
portions. The auxiliary information is for use in
subsequently decoding the information stream. Units of
the auxiliary information correspond to the units of the
information signal. A multiplexer sequentially arranges
the information stream portions and the auxiliary
information portions to provide the bit stream. The
multiplexer includes a controller that controls the
information stream divider and the auxiliary information
divider by emulating decoding of the bit stream by a
system target decoder. The system target decoder includes
a demultiplexer that demultiplexes the bit stream, a
serial arrangement of an information stream buffer and an
information stream decoder, and a serial arrangement of an
auxiliary information buffer and an auxiliary information
processor. Each of the serial arrangements is connected
to the demultiplexer. The controller controls the
information stream divider and the auxiliary information
divider such that the information stream buffer and the
auxiliary information buffer neither underflow nor
overflow.
The present invention also provides a system in which
an information signal is compressed for transfer, together
with non-compressed auxiliary information, to a medium as
a bit stream and in which the bit stream is transferred
from the medium and is processed to recover the
information signal by expansion, and to recover the
auxiliary information. The auxiliary information is for
use in recovering the information signal. The system
comprises an encoder and a decoder.

The encoder comprises an information signal
compressor that provides an information stream by
compressing fixed-sized units of the information signal
with a varying compression ratio to provide varying-sized
units of the information stream. The encoder also
includes a multiplexer that sequentially arranges
time-divided portions of the information stream and
time-divided portions of the non-compressed auxiliary
information to provide the bit stream for transfer to the
medium. The multiplexer includes a controller that
determines the division of the information stream and of
the auxiliary information into the respective time-divided
portions by emulating decoding of the bit stream by a
hypothetical system target decoder. The hypothetical
system target decoder includes a demultiplexer that
demultiplexes the bit stream, a serial arrangement of an
information stream buffer and an information stream
decoder, and a serial arrangement of an auxiliary
information buffer and an auxiliary information processor.
Each serial arrangement is connected to the demultiplexer.

The decoder is similar to the system target decoder
and includes a demultiplexer that extracts the information
stream and the auxiliary information from the bit stream
transferred from the medium. A first input buffer
receives the auxiliary information from the demultiplexer,
and a circuit removes a unit of the auxiliary
information from the first input buffer. The first input
buffer has a size of at least the size of the auxiliary
information buffer. A second input buffer receives the
information stream from the demultiplexer. The
second input buffer has a size of at least the size of the
information stream buffer. A decoder removes one of the
varying-sized units of the information stream from the
second input buffer and expands the removed unit of
the information stream to recover a unit of the
information signal.

WO 94/30014                                PCT/JP94/00942

The present invention also provides a decoder
for a bit stream obtained by multiplexing non-compressed
auxiliary information with an information stream. The
information stream is obtained by compressing fixed-size
units of an information signal with a varying compression
ratio to provide varying-sized units of the information
stream. The auxiliary information is for use in
subsequently decoding the information stream. Units of
the auxiliary information correspond to the units of the
information signal. The decoder comprises a demultiplexer
that extracts the information stream and the auxiliary
information from the bit stream. A first input buffer
receives the auxiliary information from the demultiplexer,
and a circuit removes a unit of the auxiliary information
from the first input buffer. A second input buffer
receives the information stream from the demultiplexer. A
decoder removes one of the varying-sized units of the
information stream from the second input buffer and
expands the removed unit of the information stream in
response to the unit of the auxiliary information to
recover a unit of the information signal.

The present invention further provides a method of 
deriving a multiplexed bit stream from an information 
signal. In the method, an encoder is provided. The 
encoder includes a compressor that compresses units of the 
information signal to provide access units of an 
information stream. A first buffer having a first size 
buffers the access units of the information stream. A 
circuit provides a time stamp each time the first buffer 
receives an access unit of the information stream. A 
second buffer having a second size buffers the time 
stamps. A multiplexer multiplexes the information stream 
and the time stamps to provide the multiplexed bit stream. 

A hypothetical system target decoder for decoding the 
multiplexed bit stream is defined. The hypothetical 
system target decoder includes a demultiplexer for 
demultiplexing the bit stream, a serial arrangement of an 
information stream buffer and an information stream 
decoder, and a serial arrangement of a time stamp buffer 
and a time stamp processor. Each serial arrangement is 
connected to the demultiplexer. The size of the first 
buffer and the size of the second buffer are determined by 
emulating decoding of the bit stream using the 
hypothetical system target decoder. Then, the information 
signal is encoded using the encoder with the size of the 
first buffer and the size of the second buffer set to the 
respective sizes determined by the determining step. 

Finally, the present invention provides a method for 
deriving a bit stream from an information signal. In the 
method, units of the information signal are compressed to 
provide units of an information stream. The units of the 
information stream include access points. Pointers 
pointing to the access points in the information stream 
are derived from the information stream. Then, the 
information stream, divided into information packets, is 
multiplexed together with pointer packets to provide the 
bit stream. The multiplexing is performed such that a set 
of information packets containing plural consecutive 
access points is multiplexed adjacent a pointer packet 
containing the pointers pointing only to the plural 
consecutive access points. 

Brief Description of Drawings 

Figure 1 is a block diagram of an encode/decode system 
for an audio signal and a video signal showing the 
relationship between the system and a system target 
decoder according to the prior art. 

Figure 2 shows the structure of the multiplexed bit 
stream produced by the encoder of the system shown in 
Figure 1. 

Figure 3 shows the structure of the decoder of the 
system shown in Figure 1. 

Figure 4A shows the audio input buffer and the audio 
decoder in the decoder of the system shown in Figure 1. 

Figure 4B is a bit index curve showing the average bit 
index at the input of the audio input buffer in the 
decoder of the system shown in Figure 1. 

Figure 4C is a bit index curve showing the actual bit 
index at the input of the audio input buffer in the 
decoder of the system shown in Figure 1. 

Figure 4D is a bit index curve showing the bit index 
at the output of the audio input buffer in the decoder 
operation of the system shown in Figure 1. 

Figure 5A shows the video input buffer and video 
decoder in the decoder of the system shown in Figure 1. 

Figure 5B is a bit index curve showing the average bit 
index at the input of the video input buffer in the 
decoder of the system shown in Figure 1. 

Figure 5C is a bit index curve showing the actual bit 
index at the input of the video input buffer in the 
decoder of the system shown in Figure 1. 

Figure 5D is a bit index curve showing the bit index 
at the output of the video input buffer. 

Figure 6A shows ideal buffering in the video input 
buffer in the decoder of the system shown in Figure 1. 

Figure 6B shows the effect of a changing input bit 
rate on the buffering provided by the video input buffer 
in the decoder of the system shown in Figure 1. 

Figures 7A, 7B, and 7C show various ways of remedying 
buffering errors in the video input buffer in the decoder 
of the system shown in Figure 1. 

Figures 8A, 8B, 8C, and 8D show the effect of the 
buffering start up delay on the buffering provided by the 
video input buffer in the decoder of the system shown in 
Figure 1. 

Figure 9 shows the relationship between the structure 
of the multiplexed bit stream and the operation of the 
video input buffer in the decoder of the system shown in 
Figure 1. 

Figure 10 shows the relationship between various types 
of picture encoding and the operation of the video input 
buffer in the decoder of the system shown in Figure 1. 

Figure 11A shows an alternative structure for the 
decoder of the system shown in Figure 1, in which, after 
demultiplexing the multiplexed bit stream, the respective 
time stamps are embedded into the video and audio streams. 

Figure 11B shows the audio and video streams with 
embedded time stamps produced by the decoder shown in 
Figure 11A. 

Figures 12A and 12B show the effect of the known way of 
multiplexing directory packets into the multiplexed bit 
stream on the fast-forward operation of a video tape 
recorder. 

Figure 13 shows a low-bit rate that cannot be decoded 
using a decoder conforming with the buffering delay limit 
imposed by the MPEG-1 standard. 

Figure 14 is a block diagram of a first embodiment of 
an encode/decode system according to the invention for an 
audio signal and a video signal, showing the relationship 
between the system and a first embodiment of a system 
target decoder according to the invention. 

Figure 15 shows the structure of a first embodiment of 
an encoder according to the invention showing the 
reference of various elements of the encoder to the system 
target decoder according to the invention. 

Figure 16A shows the preliminary multiplexed bit 
stream generated by the encoder shown in Figure 15. 

Figure 16B shows the multiplexed bit stream generated 
by the encoder shown in Figure 15. 

Figure 17 is a block diagram of a first embodiment of 
a decoder according to the invention. 

Figure 18 shows the bit index at the input of the 
video input buffer and at the input and the output of the 
directory input buffer in the first embodiment of the 
decoder shown in Figure 17. 

Figure 19 shows the relationship between the structure 
of the multiplexed bit stream produced by the first 
embodiment of the encoder shown in Figure 15 and the bit 
indices of the input of the video input buffer and the 
input and the output of the directory input buffer in the 
first embodiment of the decoder shown in Figure 17. 

Figure 20 shows the effect of the way of multiplexing 
directory packets into the multiplexed bit stream 
according to the invention on the fast-forward operation 
of a video tape recorder. 

Figure 21 is a block diagram of a second embodiment of 
an encode/decode system according to the invention for an 
audio signal and a video signal, showing the relationship 
between the system and a second embodiment of a system 
target decoder according to the invention. 

Figure 22A shows the structure of a second embodiment 
of an encoder according to the invention showing the 
various operational parameters of the encoder determined 
by reference to the second embodiment of the system target 
decoder according to the invention. 

Figure 22B is a block diagram illustrating the process 
by which the operational parameters of the encoder shown 
in Figure 22A are determined with reference to the second 
embodiment of the system target decoder according to the 
invention. 

Figure 23 is a block diagram of a second embodiment of 
a decoder according to the invention. 

Figure 24A illustrates the components of the total 
video delay of the encode/decode system. 

Figure 24B illustrates the components of the total 
video delay and the total audio delay of the encode/decode 
system according to the invention. 

Figure 25 shows the relationship between the structure 
of the multiplexed bit stream produced by the second 
embodiment of the encoder shown in Figure 22A and the bit 
indices of the input of the video input buffer and the 
input and the output of the video time stamp buffer in the 
second embodiment of the decoder shown in Figure 23. 

Best Mode for Carrying Out the Invention 

The present invention expands the definition of the 
system target decoder (STD) to include an input buffer and 
a decoder for each stream of non-compressed auxiliary 
information, such as time stamps and directory 
information, in addition to the input buffer and decoder 
for the audio stream and the input buffer and decoder for 
the video stream. As a consequence of the redefined STD, 
a practical decoder according to the invention will 
include an input buffer and a decoder for each stream of 
auxiliary information in addition to the respective input 
buffer and decoder for each of the audio stream and the 
video stream. Finally, an encoder according to the 
invention multiplexes the audio stream, the video stream, 
and each of the auxiliary information streams taking 
account of the parameters of the modified STD according to 
the invention. 

This approach allows many different types of auxiliary 
information to be included in the multiplexed bit stream, 
provided that (a) an input buffer and a decoder are 
provided in the system target decoder for each auxiliary 
information stream, and (b) each auxiliary information 
stream is included in the multiplexed bit stream such that 
none of the input buffers in the STD overflows or 
underflows. 

A first embodiment of an encode/decode signal 
processing system 10 according to the invention, in which 
a directory input buffer and a directory decoder are 
provided according to the invention in the system target 
decoder, is shown in Figure 14. 

In this, the encoder 1 receives the video signal S2 
from the video signal storage medium 2, and receives the 
audio signal S3 from the audio signal storage medium 3. 
The audio signal S3 could alternatively be (and is more 
usually) also received from the video signal storage 
medium 2 instead of from a separate audio storage medium. 

The encoder 1 compresses and codes the video and audio 
signals, and multiplexes the resulting audio stream and 
video stream to provide the multiplexed bit stream S1. 

The medium can be any medium suitable for storing or 
distributing a digital bit stream, for example, a CD-ROM, 
a laser disk (LD), a video tape, a magneto-optical (MO) 
storage medium, a digital compact cassette (DCC), a 
terrestrial or satellite broadcasting system, a cable 
system, a fibre-optic distribution system, a telephone 
system, an ISDN system, etc. 

The encoder 1 compresses and codes the video signal 
picture-by-picture. Each picture of the video signal is 
compressed in one of three compression modes. A picture 
compressed in the intra-picture compression mode is called 
an I-picture. In the intra-picture compression mode, the 
picture is compressed by itself without reference to other 
pictures of the video signal. Pictures compressed in the 
inter-picture compression mode are called P-pictures or 
B-pictures. A P-picture is compressed using forward 
prediction coding using as a reference picture a previous 
I-picture or P-picture, i.e., a picture occurring earlier 
in the video signal. A B-picture is compressed using 
bidirectional prediction coding. Each block of the 
B-picture may use as a reference block any one of the 
following: a block of a previous I-picture or P-picture, 
a block of a following P-picture or I-picture (i.e., a 
picture occurring later in the video signal), or a block 
obtained by performing linear processing on a block of a 
previous I-picture or P-picture and a block of a following 
I-picture or P-picture. Typically, about 150 Kbits (Kb; 1 
Kb = 1024 bits) of the video stream are required for an 
I-picture, 75 Kb of the video stream are required for a 
P-picture, and 5 Kb of the video stream are required for a 
B-picture. 
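
The typical picture sizes quoted above can be turned into a rough estimate of how many bits one Group of Pictures contributes to the video stream. The following is only an illustrative sketch: the 15-picture pattern and the names `TYPICAL_BITS`, `gop_bits`, and `EXAMPLE_GOP` are assumptions made for this example and are not taken from the specification.

```python
KB = 1024  # bits per Kb, as defined in the text

# Typical compressed picture sizes quoted in the text, in bits.
TYPICAL_BITS = {"I": 150 * KB, "P": 75 * KB, "B": 5 * KB}


def gop_bits(pattern: str) -> int:
    """Total bits for a Group of Pictures written as a string of I/P/B."""
    return sum(TYPICAL_BITS[picture] for picture in pattern)


# Hypothetical 15-picture GOP: 1 I-picture, 4 P-pictures, 10 B-pictures.
EXAMPLE_GOP = "IBBPBBPBBPBBPBB"

total = gop_bits(EXAMPLE_GOP)            # 500 Kb for this pattern
average = total / len(EXAMPLE_GOP)       # mean bits per picture
print(total // KB, "Kb per GOP,", round(average), "bits per picture on average")
```

The large spread between I-picture and B-picture sizes is what makes the amount of data removed from the video input buffer vary so strongly from picture to picture.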

The digital video and audio processing system 10 also 
includes the decoder 6, which receives as its input signal 
the bit stream S5 from the medium 5. The decoder 6 
performs demultiplexing inverse to the multiplexing 
performed by the encoder 1. The decoder performs 
processing complementary to that performed by the encoder 
1 to decode and expand the resulting audio stream and 
video stream to provide the recovered video signal S6A and 
the recovered audio signal S6B respectively. The 
recovered video signal S6A and the recovered audio signal 
S6B closely match the video signal S2 and the audio signal 
S3 fed into the encoder 1. 

Figure 14 also shows the system target decoder (STD) 4, 
which is used to define the processing performed by the 
encoder 1 and the decoder 6. In practical video and audio 
signal processing systems, the encoder does not include an 
actual system target decoder, but instead performs the 
encoding processing and multiplexing taking account of the 
system target decoder parameters. Also, in practical 
systems, the decoder is designed taking the system target 
decoder parameters into account. These relationships 
between the system target decoder and the encoder and the 
decoder are indicated in Figure 14 by the broken line 
labelled S4A interconnecting the system target decoder 4 
and the encoder 1, and the broken line labelled S4B 
interconnecting the system target decoder 4 and the 
decoder 6. 

The system target decoder 4 includes a reference video 
decoder, a reference audio decoder, and their respective 
input buffers. In addition, the system target decoder 
includes a directory decoder and an input buffer for the 
directory decoder. The size of the audio input buffer, 
the size of the video input buffer, and the operation of 
the audio and video decoders are defined by the MPEG 
standards. In addition, the invention defines the size of 
the directory buffer and the operation of the directory 
decoder to make them compatible with the sizes of the 
other buffers and the operation of the other decoders 
defined by the MPEG standard. 

As mentioned above, the concept of the system target 
decoder provides compatibility between encoders and 
decoders of different designs as follows. All encoders 
are designed to provide a bit stream that can be 
successfully decoded by the system target decoder, and 
that does not cause the respective input buffers in the 
system target decoder to overflow or underflow. In 
addition, all decoders are designed taking the system 
target decoder parameters into account. As a result, all 
such decoders will be capable of successfully decoding the 
bit stream produced by any of the encoders designed to 
produce a bit stream capable of being decoded by the 
system target decoder. By including a directory buffer 
and a directory decoder in the STD, the invention enables 
encoders and decoders to be made compatible with one 
another in an additional respect, namely, that of 
providing and decoding directory information. 

The structure of the hypothetical system target 
decoder 4 shown in Figure 14 is as follows. The 
demultiplexer 41 notionally receives the bit stream S1 
from the encoder 1. The demultiplexer 41 demultiplexes 
the bit stream into a video stream S1V, an audio stream 
S1A, and a directory stream S1D. The video stream is fed 
to the video input buffer 42, the output of which is 
connected to the video decoder 45. The audio stream from 
the demultiplexer 41 is fed into the audio input buffer 
43, the output of which is connected to the audio decoder 
46. The directory stream from the demultiplexer 41 is fed 
into the directory input buffer 44, the output of which is 
connected to the directory decoder 47. 
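
The routing performed by the demultiplexer 41 can be pictured with a very small model. This is a sketch only: the class name `SystemTargetDecoderModel`, the `(kind, payload)` packet representation, and the use of queues are assumptions chosen for illustration and do not describe any real decoder interface.

```python
from collections import deque


class SystemTargetDecoderModel:
    """Toy model: one input queue per elementary stream (hypothetical)."""

    def __init__(self):
        # Stand-ins for input buffers 42 (video), 43 (audio), 44 (directory).
        self.buffers = {
            "video": deque(),
            "audio": deque(),
            "directory": deque(),
        }

    def demultiplex(self, packets):
        """Route each (kind, payload) packet to the matching input buffer."""
        for kind, payload in packets:
            self.buffers[kind].append(payload)


std = SystemTargetDecoderModel()
std.demultiplex([
    ("directory", "entry-0"),
    ("video", "picture-0"),
    ("audio", "frame-0"),
    ("video", "picture-1"),
])
print({kind: list(queue) for kind, queue in std.buffers.items()})
```

Each queue would then be drained by its own decoder, which is the serial buffer-and-decoder arrangement the text describes for each stream.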

In the example shown in Figure 14, the video input 
buffer 42 and the audio input buffer 43 have the 
respective storage capacities defined by the MPEG 
standards, namely, 46 Kbytes and 4 Kbytes in the MPEG-1 
standard. The directory input buffer 44 according to the 
invention has a storage capacity of 1 Kbit, so that it 
will hold 10 directory entries. This capacity is of the 
same order as, but is larger than, the directory buffer 
capacity currently used. These capacities are set in 
consideration of the practical constraints imposed by 
providing the real decoder 6 using a processor that cannot 
include a large amount of storage. 
The video decoder 45 removes the video stream from the 
video input buffer 42 one video access unit at a time, 
i.e., one picture at a time, at a timing corresponding to 
the picture rate of the video signal, e.g., once every 
1/29.97 seconds in an NTSC system. The amount of the 
video stream removed from the video input buffer for each 
picture varies because of the different amount of 
compression applied to each picture. 
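
The buffer behaviour described here can be emulated in a few lines. The sketch below assumes a simplified model that is not spelled out in the text: bits arrive at a constant rate, the decoder removes each picture instantaneously, and decoding starts after a fixed start-up delay. The function name and all parameters are hypothetical.

```python
def emulate_video_buffer(picture_bits, bit_rate, picture_rate,
                         capacity_bits, startup_delay):
    """Return 'ok', 'overflow', or 'underflow' for a list of picture sizes.

    Assumed model: bits enter the buffer at bit_rate (bits/s); after
    startup_delay seconds, one whole picture is removed every
    1/picture_rate seconds.
    """
    total_bits = sum(picture_bits)
    removed = 0                      # bits already taken out by the decoder
    period = 1.0 / picture_rate
    for n, size in enumerate(picture_bits):
        t = startup_delay + n * period            # removal time of picture n
        delivered = min(total_bits, bit_rate * t)
        fullness = delivered - removed            # peak, just before removal
        if fullness > capacity_bits:
            return "overflow"
        if fullness < size:                       # picture not fully arrived
            return "underflow"
        removed += size
    return "ok"


# Small made-up numbers: three 100-bit pictures at 4 pictures/s, 400 bit/s.
print(emulate_video_buffer([100, 100, 100], 400, 4, 200, 0.25))
```

Because the fullness only rises between removals in this model, checking it just before each removal is enough to catch the peak. An encoder that takes the STD into account would adjust its multiplexing until such an emulation reports neither overflow nor underflow.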

The audio decoder 46 removes the audio stream from the 
audio input buffer 43 one audio access unit at a time at a 
predetermined timing. 

The directory decoder 47 removes the directory stream 
from the directory input buffer one directory entry at a 
time as required. For example, in the fast-forward mode 
described above, after the access point at the beginning 
of each Group of Pictures is read, the directory decoder 
removes from the directory input buffer the directory 
stream of the directory entry indicating the location of 
the access point at the beginning of the next Group of 
Pictures. 

The structure of an embodiment of the encoder 1 
according to the invention is shown in Figure 15. The 
encoder generates a multiplexed bit stream from an audio 
signal and a video signal for feeding to the medium 5. The 
encoder also includes directory information in the 
multiplexed bit stream to enable program selections to be 
located, and to enable pictures to be displayed in fast 
forward and fast rewind operations. In the multiplexed 
bit stream, each directory packet of directory information 
must be located ahead of the video packets containing the 
video stream to which the directory entries in the 
directory packets belong. However, the directory entries 
in the directory packet are generated from the video 
stream following the directory packet. Therefore, the 
directory entries must be added to the directory packets 
after the video signal has been encoded and multiplexed 
into the multiplexed bit stream. The encoder 1 can only 
do this in one pass if the medium 5 has a random access 
capability (such as a hard disk), so that the medium can 
occasionally go back to write the directory entries into 
the directory packets. If the medium 5 does not have a 
random access capability, or if the medium 5 is a 
transmission medium, the encoder can provide the 
multiplexed bit stream including directory entries in two 
passes. As an example, an embodiment of the encoder will 
be described that provides a multiplexed bit stream in two 
passes for recording on the master tape from which 
distribution media (such as video tapes or video discs) 
are manufactured. 

In the encoder 1, the digital video signal S2 is fed 
into the video encoder 201, and the digital audio signal 
S3 is fed into the audio encoder 202. The video stream and 
the audio stream from the video encoder 201 and the audio 
encoder 202, respectively, are fed, after internal 
buffering (not shown), into the multiplexing circuit 203. 
The output of the multiplexing circuit 203 is connected to 
the digital storage medium (DSM) 210, where the resulting 
preliminary multiplexed bit stream is temporarily stored. 

The multiplexer 203 assembles the preliminary 
multiplexed bit stream by time multiplexing the elementary 
streams, i.e., the video stream, the audio stream, and a 
directory stream of dummy directory entries, into packets, 
and the packets into packs. The multiplexer also adds the 
multiplexing layer, i.e., the packet header for each 
packet, and the pack header for each pack. The multiplexer 
203 receives the headers from the header generator 204, 
and receives the dummy directory entries from the dummy 
directory entry generator 205. 

The multiplexer 203 also feeds the preliminary 
multiplexed bit stream to the directory entry generator 
231, which counts the bit index of the preliminary 
multiplexed bit stream and detects the access point at the 
beginning of each Group of Pictures to generate a 
directory entry for each access point. The directory 
entry generator assembles the directory entries into a 
directory stream, which it feeds to the directory storage 
medium 233 for storage. 

The directory entry counter 235 tracks the state of 
the directory input buffer 44 in the system target decoder 
4. The directory entry counter monitors the output of the 
dummy directory entry generator 205 fed to the multiplexer 
203. Each dummy directory entry fed into the multiplexer 
203 increments the directory entry counter by one. The 
directory entry counter 235 also monitors the output of 
the directory entry generator 231 fed to the directory 
stream storage medium 233. Each directory entry 
decrements the count of the directory entry counter by 
one. 

A preset limit is applied to the directory entry 
counter 235 according to the size of the directory input 
buffer 44 in the system target decoder 4. When the count 
of the directory entry counter reaches the preset limit, 
indicating that the directory input buffer is full, the 
directory entry counter feeds a buffer full interrupt to 
the dummy directory entry generator 205. The buffer full 
interrupt stops the dummy directory entry generator from 
feeding dummy directory entries to the multiplexer 203. 
When the directory buffer has a capacity of 1 Kbit, the 
preset limit corresponds to ten dummy directory entries. 
When the count of the directory entry counter 235 
indicates that the directory input buffer 44 is empty, the 
directory entry counter feeds the buffer empty interrupt 
to the multiplexer 203 to cause the multiplexer to insert 
another dummy directory packet into the preliminary 
multiplexed bit stream. 
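
The counting scheme of the directory entry counter 235 can be sketched as follows. This is an illustrative model under stated assumptions, not the circuit of the specification: the class and method names are hypothetical, and the two interrupts are represented simply as boolean return values.

```python
class DirectoryEntryCounter:
    """Toy model of the up/down counter that mirrors the STD directory buffer."""

    def __init__(self, limit=10):
        # Preset limit: ten entries for a 1 Kbit buffer, per the text.
        self.limit = limit
        self.count = 0

    def dummy_entry_written(self) -> bool:
        """Count one dummy entry; True stands for the buffer full interrupt."""
        self.count += 1
        return self.count >= self.limit

    def real_entry_generated(self) -> bool:
        """Count one generated entry; True stands for the buffer empty interrupt."""
        self.count -= 1
        return self.count <= 0


counter = DirectoryEntryCounter()
full_flags = [counter.dummy_entry_written() for _ in range(10)]
empty_flags = [counter.real_entry_generated() for _ in range(10)]
print(full_flags[-1], empty_flags[-1])
```

In this model the tenth dummy entry raises the full condition, and the tenth generated entry raises the empty condition, which is the point at which another directory packet has to be inserted.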

During the second step of the encoding process, in 
which the directory entries are written over the dummy 
directory entries in the preliminary multiplexed bit 
stream to provide the multiplexed bit stream, the digital 
storage medium 210 feeds the preliminary multiplexed bit 
stream and the directory storage medium 233 feeds the 
directory stream to the directory stream insertion circuit 
250. The directory stream controller 256 monitors the 
preliminary bit stream read out from the digital storage 
medium 210 to determine the locations in the preliminary 
bit stream of the directory packets into which the 
directory stream is to be inserted. When it detects each 
directory packet header, the directory stream controller 
feeds the directory stream insert control signal to the 
directory stream insertion circuit and the directory 
stream storage medium. The directory stream counter 258 
determines the number of directory entries inserted into 
the directory packet, and causes the directory stream 
controller to change the state of the directory stream 
insert control signal when the directory packet is full. 

The video encoder 201, the audio encoder 202, the 
multiplexer 203, the directory entry counter 235, and the 
directory stream counter 258 are all designed to provide a 
preliminary multiplexed bit stream that, when notionally 
decoded by the system target decoder 4, causes none of the 
input buffers 42, 43, and 44 in the system target decoder 
to overflow or underflow. This relationship is indicated 
by the dotted line S4A. 

The encoder 1 operates as follows. At the beginning 
of the recording, the multiplexer 203 turns to the header 
generator 204 to receive all the headers for the start of 
the recording, and feeds these headers to the DSM 210. 
The multiplexer then receives from the header generator 
the pack header for the first pack in the recording, 
followed by the packet header for the first packet. The 
first packet is a directory packet, since the first packet 
of the recording is a directory packet. 

The multiplexer 203 then turns to the dummy directory 
entry generator 205, and feeds dummy directory entries 
from the dummy directory entry generator to the DSM 210. 
Each dummy directory entry fed to the multiplexer 
increments the directory entry counter 235 by one. When 
the count of the directory entry counter reaches the 
preset limit corresponding to the number of directory 
entries that can be accommodated in the directory input 
buffer 44 in the system target decoder 4, the directory 
entry counter feeds the buffer full interrupt to the dummy 
directory entry generator 205, which causes the dummy 
directory entry generator to stop feeding directory 
entries into the multiplexer. 

After it has fed the directory packet full of dummy 
directory entries to the DSM 210, the multiplexer 203 
turns back to the header generator 204 to receive the 
packet header of the first video packet, which it feeds to 
the DSM 210. Then, taking the respective states of the 
video input buffer 42 and the audio input buffer 43 in the 
system target decoder 4 into account, the multiplexer 
multiplexes the video stream and the audio stream together 
to provide video packets and audio packets, which it feeds 
to the DSM 210. 

During this process, the directory entry generator 231 
monitors the output of the multiplexer 203 to detect each 
access point in the bit stream. An access point is an 
access unit that is capable of being decoded on its own, 
without the need to decode other access units in the bit 
stream. For example, a video access point is a picture 
that is compressed wholly or partially using intra-picture 
coding. In video streams, an access point occurs at the 
beginning of each Group of Pictures. The directory entry 
generator 231 also counts the bit index of the preliminary 
multiplexed bit stream. Each time it detects an access 
point in the preliminary multiplexed bit stream, the 
directory entry generator converts the bit index of the 
access point into a relative address on the final storage 
medium, i.e., the video cassette in this example. The 
directory entry generator then creates a directory entry 
for that access point, which it feeds to the directory 
entry storage medium 233 for storage as a unit of the 
directory stream. 
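
The bookkeeping done by the directory entry generator 231 amounts to keeping a running bit index and recording an address whenever a Group of Pictures begins. The sketch below is hedged accordingly: the `Packet` type, the `starts_gop` flag, and the conversion of a bit index to a block address by integer division are assumptions made for illustration, since the text does not define the address format.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    kind: str                 # "video", "audio", or "directory"
    bits: int                 # packet length in bits, headers included
    starts_gop: bool = False  # True if a Group of Pictures begins here


def generate_directory_entries(packets, bits_per_block=2048 * 8):
    """Return one relative block address per access point (hypothetical format)."""
    entries = []
    bit_index = 0             # running bit index of the preliminary stream
    for packet in packets:
        if packet.kind == "video" and packet.starts_gop:
            entries.append(bit_index // bits_per_block)
        bit_index += packet.bits
    return entries


stream = [
    Packet("directory", 16384),
    Packet("video", 32768, starts_gop=True),
    Packet("audio", 16384),
    Packet("video", 16384, starts_gop=True),
]
print(generate_directory_entries(stream))
```

The directory packet itself is counted in the bit index, which is why the dummy entries must occupy exactly the space the real entries will later fill.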

The directory entry counter 235 decrements its count 
for each directory entry generated by the directory entry 
generator 231 and fed to the directory entry storage 
medium 233. When the state of the directory entry counter 
corresponds to the directory input buffer 44 of the system 
target decoder 4 being empty, the directory entry counter 
235 provides the buffer empty interrupt to the multiplexer 
203. 

The buffer empty interrupt indicates to the 
multiplexer 203 that the multiplexer has received all of 
the access points whose directory entries will be stored 
in the preceding directory packet (in this example, the 
directory packet at the beginning of the pack), and that 
it must include another directory packet in the 
preliminary multiplexed bit stream before the next access 
point in the video stream. Accordingly, in response to 
the buffer empty interrupt, the multiplexer 203 completes 
the current video packet, and the following audio packet, 
if any. After this, the multiplexer turns to the header 
generator 204 to receive a directory header, which it 
feeds to the DSM 210. The multiplexer then turns to the 
dummy directory entry generator 205, and feeds dummy 
directory entries from the dummy directory entry generator 
to the DSM 210 until it receives the buffer full interrupt 
from the directory entry counter 235. The multiplexer 
then proceeds to multiplex more of the video stream and 
the audio stream, until another buffer empty interrupt 
indicates that another directory packet must be inserted. 
The resulting preliminary multiplexed bit stream recorded 
on the DSM 210 is shown in Figure 16A. 

When the preliminary multiplexed bit stream and the 
directory entries for the whole recording are respectively 
stored on the digital storage medium 210 and the directory 
storage medium 233, the second pass of the encoding 
process is performed to replace the dummy directory 
entries in the directory packets in the preliminary 
multiplexed bit stream with directory entries from the 
directory stream to provide the multiplexed bit stream. 
The preliminary multiplexed bit stream is reproduced from 
the DSM 210 from its beginning, and is fed into the 
directory stream insertion circuit 250. The directory 
stream controller 256 monitors the preliminary multiplexed 
bit stream for directory headers. 

Each time the directory stream controller detects a 
directory header, it sends the directory entry insert 
signal to the directory entry storage medium 233 and to 
the directory stream insertion circuit 250, and 
initialises the directory stream counter 258 to the preset 
value discussed above. In response to the directory entry 
insert signal, the directory entry storage medium 233 
feeds the directory stream to the directory stream 
insertion circuit 250. The directory stream insertion 
circuit places each directory entry in the directory 
stream into the directory packet following the directory 
header in the preliminary multiplexed bit stream. The 
directory stream insertion circuit overwrites the dummy 
directory entries in the preliminary multiplexed bit 
stream with the directory entries. The directory stream 
insertion circuit feeds the resulting multiplexed bit 
stream to the medium 5 (Figure 14). 

The directory stream counter 258 monitors the 
directory entries in the directory stream fed to the 
directory stream insertion circuit 250. Each directory 
entry fed to the directory stream insertion circuit 
decrements the directory stream counter by one. When the 
directory stream counter reaches zero, the directory 
stream counter feeds the packet full signal to the 
directory stream insertion controller 256. In response to 
this signal, the directory stream insertion controller 
changes the state of the directory entry insert signal. 
This causes the directory entry storage medium 233 to stop 
sending the directory stream to the directory stream 
insertion circuit 250, and causes the directory stream 
insertion circuit to feed the preliminary multiplexed bit 
stream out unchanged as the multiplexed bit stream until 
the directory stream controller once more detects a 
directory packet header in the preliminary multiplexed bit 
stream. The resulting multiplexed bit stream fed to the 
medium 5 (Figure 14) is shown in Figure 16B. 
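
The second-pass insertion described above can be sketched in software as follows. This is only an illustrative model under stated assumptions, not the circuit of the invention: the token representation of the bit stream, the constant names, and the function `insert_directory_entries` are all hypothetical. The variable `counter` plays the role of the directory stream counter 258, and `preset` is the preset value to which it is initialised.

```python
from typing import Iterable, List, Tuple

# Hypothetical token model: each item of the stream is (kind, payload).
Token = Tuple[str, object]
DIR_HEADER, DUMMY, DIR_ENTRY, OTHER = "dir_header", "dummy", "dir_entry", "other"


def insert_directory_entries(
    preliminary: Iterable[Token], entries: Iterable[object], preset: int
) -> List[Token]:
    """Overwrite dummy entries after each directory header (second pass).

    Assumes `entries` holds at least as many items as will be inserted.
    """
    stream = iter(entries)
    counter = 0  # zero means "feed the stream out unchanged"
    out: List[Token] = []
    for kind, payload in preliminary:
        if kind == DIR_HEADER:
            counter = preset  # header detected: re-arm the counter
            out.append((kind, payload))
        elif kind == DUMMY and counter > 0:
            out.append((DIR_ENTRY, next(stream)))  # overwrite the dummy entry
            counter -= 1  # each inserted entry decrements the counter
        else:
            out.append((kind, payload))  # pass through unchanged
    return out
```

When the counter reaches zero, the loop simply passes tokens through until the next directory header, mirroring the "packet full" behaviour described above.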

The same basic circuit arrangement can optionally be 
used to provide pictures in the fast-rewind mode in 
addition to the fast-forward mode. If the same size 
directory input buffer 44 is employed in the system target 
decoder 4, controlling the multiplexing of the directory 
packets according to the state of the directory input 
buffer in the system target decoder 4 results in 
approximately twice the number of directory packets being 
inserted into the preliminary multiplexed bit stream 
compared with when pictures are to be provided only in the 
fast-forward mode. This is because each directory packet 
must hold the directory entries for the n/2 access points 
following the directory packet (for use in the fast forward 
mode) and for the n/2 access points before the directory 
packet (for use in the fast rewind mode), where n is the 
number of directory entries that can be stored in the 
directory input buffer 44 in the system target decoder 4. 
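
The "approximately twice" figure follows from simple arithmetic, which the sketch below illustrates. The function names and the sample numbers are assumptions made for illustration only; the source gives only the n/2 split.

```python
def entries_per_direction(n: int, bidirectional: bool) -> int:
    """Access points served in each direction by one directory packet.

    n is the number of directory entries the directory input buffer holds.
    """
    return n // 2 if bidirectional else n


def directory_packets_needed(access_points: int, n: int, bidirectional: bool) -> int:
    """Directory packets required to cover every access point once per direction."""
    per_packet = entries_per_direction(n, bidirectional)
    return -(-access_points // per_packet)  # ceiling division
```

For example, with n = 10 and 100 access points, fast-forward only needs 10 directory packets, while supporting both directions needs 20.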

Figure 17 shows the structure of the decoder 6. The 
decoder 6 is designed in consideration of the parameters 
of the system target decoder 4 (Figure 14) to decode the 
multiplexed bit stream shown in Figure 16B produced by the 
encoder 1. As a result, the decoder 6 has a structure very 
similar to that of the system target decoder 4. 

The decoder 6 includes the demultiplexer 61, which 
receives the multiplexed bit stream S5 from the medium 5. 
The demultiplexer demultiplexes the multiplexed bit stream 
into the video stream S5V, the audio stream S5A, and the 
directory stream S5D. Incidentally, as will be described 
in more detail below, the demultiplexer also demultiplexes 
the video time stamps and the audio time stamps (not 
shown) from the multiplexed bit stream. 

The video stream S5V from the output of the 
demultiplexer 61 is fed into the video input buffer 62, 
which precedes the video decoder 65. The audio stream S5A 
from the demultiplexer is fed into the audio input buffer 
63, which precedes the audio decoder 66. The directory 
stream S5D from the demultiplexer is fed into the 
directory input buffer 64, which precedes the directory 
decoder 67. 

The video decoder 65 removes each access unit, i.e., 
picture, of the video stream from the video input buffer 
62 for decoding in the order in which the access unit was 
received by the video input buffer. The audio decoder 66 
removes each access unit of the audio stream from the 
audio input buffer 63 for decoding in the order in which 
the access unit was received by the audio input buffer. 
The directory decoder 67 removes each directory entry of 
the directory stream from the directory input buffer 64 in 
the order in which the directory entry was received by the 
directory input buffer. 

The input buffers 62, 63, and 64 will be described in 
detail next. It is not possible to decode the elementary 
[...] completely [...] clocks. The first reason for this is 
that, as mentioned above, the compression ratios [...] 
average transfer rates of the elementary streams from the 
medium 5 differ from the average input rate of the 
elementary streams to the respective decoders 65, 66, and 
67, depending on the error in the sampling rate clocks. 
Moreover, the elementary streams are transferred from the 
medium 5 via the demultiplexer 61 intermittently, and the 
decoders demand the access units of their respective 
elementary streams intermittently. Consequently, the 
instantaneous transfer rate of the elementary streams from 
the medium 5 and the instantaneous input rate of the 
elementary streams into their respective decoders do not 
match. Therefore, the input buffers 62, 63, and 64 are 
provided between the demultiplexer 61 and the respective 
decoders 65, 66, and 67 to accommodate the differences in 
the average transfer rate and the average input rate, and 
in the instantaneous transfer rate and the instantaneous 
input rate. 

[...] showing the time dependency of the transfer of the video 
stream S5V in the multiplexed signal from the medium 5 
[...] into the video input buffer at first, because the 
demultiplexer 61 first feeds the directory stream into the 
directory buffer 64. Then, following the first video 
packet header in the multiplexed bit stream, the 
demultiplexer transfers the video stream in the following 
video packet(s) into the video input buffer 62 at a 
substantially constant bit rate until it encounters the 
next directory packet header in the multiplexed bit 
stream. In response to the directory packet header, the 
demultiplexer interrupts feeding the video stream into the 
video input buffer while it feeds the directory stream in 
the directory packet into the directory input buffer 64. 
During this interruption, the bit index of the video 
stream remains unchanged. At the end of the directory 
packet, in response to the packet header of the first 
following video packet, the demultiplexer resumes 
transferring the video stream contained in the video 
packet(s) into the video input buffer until it encounters 
another directory packet header in the multiplexed bit 
stream. This process is repeated throughout the decoding 
process. The bit index at the output of the video input 
buffer is the same as that shown in Figure 5D. 

Transfer of the video stream into the video input 
buffer 62 is also interrupted when the demultiplexer 
encounters an audio packet header in the multiplexed bit 
[...]. These interruptions occur more frequently than 
the interruptions to transfer the directory stream, but 
they have been omitted from Figure 18 to simplify the 
drawing. 

Figure 18 shows in its lower part a bit index curve of 
the time dependency of the transfer of the directory 
stream S5D in the multiplexed signal from the medium 5 
into the directory input buffer 64. The demultiplexer 61 
detects the directory packet header at the beginning of 
the multiplexed bit stream and transfers the directory 
access unit contained in the following directory packet 
from the medium 5 into the directory input buffer 64. 
Following the first directory packet, the demultiplexer 
ceases transferring the directory stream into the 
directory input buffer, and transfers the video stream in 
the following video packet(s) into the video input buffer 
62 and the audio stream in the following audio packet(s) 
into the audio input buffer 63, until the demultiplexer 
61 encounters the next directory packet header in the 
multiplexed bit stream and feeds the directory stream in 
the directory packet following the directory packet header 
into the directory input buffer. This process is repeated 
throughout the decoding process. 

The lower part of Figure 18 also shows the bit index 
of the output of the directory input buffer 64. The 
initial transfer of directory stream into the directory 
input buffer at the beginning of the multiplexed bit 
stream fills the directory input buffer to capacity. 
Then, as the video stream is received, the directory 
decoder 67 removes directory entries one-by-one from the 
directory input buffer until the directory input buffer is 
empty. However, because the multiplexed bit stream has 
been constructed to take account of the operation of the 
directory input buffer and the directory decoder, another 
directory packet occurs in the multiplexed bit stream 
before the next access point. As a result, the directory 
stream in the next directory packet is transferred into 
the directory input buffer (a) when the directory input 
buffer is empty, so that transferring the directory stream 
into the directory input buffer does not cause the 
directory buffer to overflow, and (b) before the directory 
decoder attempts to remove another directory entry from 
the directory input buffer, so that removing the next 
directory entry does not cause the directory input buffer 
to underflow. 
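
Conditions (a) and (b) above can be checked by replaying the order of events in the multiplexed bit stream against a model of the directory input buffer. The following is a minimal sketch, assuming a hypothetical event list in which `("dir_packet", k)` means a directory packet carrying k entries and `("access_point", 1)` means the directory decoder removes one entry. The function name and event encoding are not taken from the source.

```python
from typing import Iterable, Tuple


def check_directory_buffer(events: Iterable[Tuple[str, int]], capacity: int) -> bool:
    """Return True if the event sequence never overflows or underflows
    a directory input buffer holding at most `capacity` entries."""
    occupancy = 0
    for kind, count in events:
        if kind == "dir_packet":
            occupancy += count  # demultiplexer writes the packet's entries
            if occupancy > capacity:
                return False  # overflow: packet arrived too early or too large
        elif kind == "access_point":
            if occupancy == 0:
                return False  # underflow: no entry left for this access point
            occupancy -= 1  # directory decoder removes one entry
    return True
```

An encoder-side emulation of the system target decoder could use a check of this kind to decide where directory packets may be placed.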



Figure 19 shows how the bit indices shown in Figure 18 
relate to the multiplexed bit stream produced by the 
encoder 1 (Figure 14). In Figure 19, the directory 
[...] the directory stream into the directory input 
[...] multiplexed stream are linked [...] of 
broken lines. Also, transfer of the access point at the 
beginning of each Group of Pictures into the video input 
buffer 62 is linked to the removal of the directory entry 
for that access point from the directory input buffer by 
straight broken lines interconnecting the bit index curve 
[...] directory input buffer 64. 

Figure 20 shows the beneficial effect on the fast 
forward operation of a video tape recorder of the rational 
sizing and placement of the directory packets in the 
multiplexed bit stream resulting from using the modified 
system target decoder according to the invention to 
control the multiplexing of the multiplexed bit stream. 
The resulting sizing of the directory packets in the 
multiplexed bit stream ensures that each directory packet 
contains only the number of directory entries that can be 
accommodated in the directory input buffer 44 of the 
system target decoder, and, hence, in the directory input 
buffer 64 of the decoder 6. The resulting placing of the 
directory packets in the multiplexed bit stream ensures 
that the directory entries contained in each directory 
packet belong only to the access points in the video 
stream in the video packets following the directory packet 
and before the next directory packet. Consequently, 
Figure 20 differs from Figures 12A to 12E in that the 
video tape recorder does not have to go back several times 
to read the contents of the directory packet. 

During the fast-forward operation illustrated in 
Figure 20, the video tape recorder first reads the 
directory packet at the beginning of the multiplexed bit 
stream, and transfers the directory stream to the 
directory input buffer 64. The directory stream fills the 
directory input buffer to capacity. The directory decoder 
67 then removes the first directory entry from the 
directory input buffer, and instructs the video tape 
recorder to skip to the address indicated by the first 
directory entry. At that address, the video tape recorder 
reproduces the video stream of the picture at the access 
point, located at that address at the beginning of the 
zero-th Group of Pictures. The video stream of the 
picture is then decoded for display. 

The directory decoder then removes the second 
directory entry from the directory input buffer, and 
instructs the video tape recorder to skip to the address 
indicated by the second directory entry. At that address, 
the video tape recorder reproduces the video stream of the 
picture at the access point, located at that address at 
the beginning of the first Group of Pictures. The video 
stream of the picture is then decoded for display. 

The process just described repeats until the directory 
decoder has removed the tenth directory entry from the 
directory buffer and the picture at the access point at 
the beginning of the ninth Group of Pictures has been 
reproduced and displayed. The directory buffer 64 is now 
empty, and, if the directory decoder 67 attempted to 
remove another directory entry, it would cause the 
directory input buffer to underflow. However, the next 
directory packet is located before the next access point. 
The video tape recorder reproduces the directory stream 
from the directory packet and transfers it into the 
directory input buffer, which, being empty, can 
accommodate the whole of the directory stream in the 
directory packet. The directory decoder then removes the 
first directory entry from the directory input buffer, and 
instructs the video tape recorder to skip to the address 
indicated by the first directory entry. At that address, 
the video tape recorder reproduces the video stream of the 
picture at the access point, located at that address at 
the beginning of the tenth Group of Pictures. The video 
stream of the picture is then decoded for display. This 
process repeats until the fast-forward process stops. 

The encoder 1 according to the invention has used the 
modified system target decoder 4 according to the 
invention to size and place the directory packets in the 
multiplexed bit stream so that at no time during the 
fast-forward process does the decoder 6 have to attempt to 
remove directory entries from an empty directory input 
buffer (which would result in an underflow of the 
directory input buffer) or to fill the directory input 
buffer with directory stream when the directory input 
buffer is not empty (which would result in an overflow of 
the directory input buffer). 
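
The fast-forward sequence of Figure 20 can be modelled as a short traversal. This is a hedged sketch only: the representation of each directory packet as a list of access-point addresses, and the function name `fast_forward`, are assumptions introduced for illustration. The example relies on the property stated above, namely that each directory packet arrives when the directory input buffer is empty and fits within its capacity.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List


def fast_forward(directory_packets: Iterable[List[int]], capacity: int) -> Iterator[int]:
    """Yield access-point addresses in the order a player would skip to them."""
    buffer: Deque[int] = deque()
    for packet in directory_packets:
        # The stream is built so that this never triggers.
        if buffer or len(packet) > capacity:
            raise ValueError("directory input buffer would overflow")
        buffer.extend(packet)  # packet fills the empty buffer
        while buffer:
            yield buffer.popleft()  # skip to this address, decode, display
```

With ten entries per packet, the first packet serves Groups of Pictures zero to nine and the second packet starts at the tenth, as in the walk-through above.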

Figure 21 shows a second embodiment of the digital 
video and audio signal processing system 10A according to 
the invention, in which a time stamp buffer and a time 
stamp decoder are provided in the modified system target 
decoder 4A according to the invention for each of the 
audio time stamps and the video time stamps. 

Using the modified system target decoder 4A according 
to the invention, the encoder 1A is able to optimize the 
system video stream buffering delay and other encoding 
parameters to generate compliant bit streams with the best 
possible picture quality for the required video bit rate, 
while keeping the decoder buffering delays as low as is 
practical in a one-pass system. 

[...] the encoder 1A 

receives the video signal S2 from the video signal storage 
medium 2, and receives the audio signal S3 from the audio 
signal storage medium 3. The audio signal S3 could 
alternatively be (and is more usually) also received from 
the video signal storage medium 2 instead of from a 
separate audio storage medium. 

The encoder 1A compresses and codes the video and 
audio signals, and [...] stream 
which is fed to the medium 5 for storage or 
distribution. The medium can be any [...] 
storing or distributing a digital bit stream, for example, 
a CD-ROM, a laser disk (LD), a video tape, a 
magneto-optical (MO) storage medium, a digital compact 
cassette (DCC), [...] 
system, a cable system, a fibre-optic distribution system, 
a telephone system, an ISDN system, etc. 

The encoder 1A compresses and codes the [...]. 
Each picture of the video signal is 
compressed as an I-picture, a P-picture, or a B-picture as 
described above. 

The digital video and audio processing system 10A also 
includes the decoder 6A, which receives [...] 
the bit stream S5 from the medium 5. [...] 
6A performs demultiplexing inverse to the multiplexing 
performed by the encoder 1A. The decoder performs 
processing complementary to that performed by the encoder 
1A to decode the resulting audio stream and video stream 
to provide the recovered video signal S6A and the 
recovered audio signal S6B. The recovered video signal 
S6A and the recovered audio signal S6B respectively 
closely match the video signal S2 and the audio signal S3 
fed into the encoder 1A. 

Figure 21 also shows the system target decoder (STD) 
4A which is used to define the processing characteristics 
of the encoder 1A and the decoder 6A. In practical video 
and audio signal processing systems, the encoder does not 
include an actual system target decoder, but instead 
performs the encoding processing and multiplexing taking 
account of the system target decoder parameters. Also, 
practical decoders are designed taking the system target 
decoder parameters into account to minimize hardware cost, 
etc. These relationships between the system target 
decoder and the encoder and the decoder are indicated in 
Figure 21 by the broken line labelled S4A interconnecting 
the system target decoder 4A and the encoder 1A, and the 
broken line labelled S4B interconnecting the system target 
decoder 4A and the decoder 6A. 

The system target decoder 4A includes a reference video 
[...] system target decoder includes a video time stamp 
processing module 55, an audio time stamp processing 
module 56, and their respective input buffers 52 and 53. 
[...] input buffer, and the operation of the audio and video 
decoders are defined by the MPEG standards, as described 
above. In addition, the invention defines the sizes of 
the video time stamp buffer and the audio time stamp 
buffer, and the time stamp coding frequency. [...] 
are defined to optimize the utilization of the other input 
buffers. 

Again, as discussed above, the concept of the modified 
system target decoder according to the invention provides 
compatibility between encoders and decoders of different 
[...] streams, but also with respect to the audio and video 
time stamp buffering. In particular, the modified system 
target decoder according to the invention provides this 
compatibility without the need to impose a maximum on the 
[...] standard to be extended to cover such applications as 
low bit-rate video slide shows and the like. 

The structure of the hypothetical system target 
decoder 4A shown in Figure 21 is as follows. The 
demultiplexer 41A notionally receives the bit stream S1A 
from the encoder 1A. The demultiplexer 41A demultiplexes 
the bit stream into a video stream S1V, an audio stream 
S1A, video time stamps VTS and audio time stamps ATS. The 
video stream S1V is fed to the video input buffer 42, the 
output of which is connected to the video decoder 45. The 
audio stream from the demultiplexer 41A is fed into the 
audio input buffer 43, the output of which is connected to 
the audio decoder 46. The video time stamps from the 
demultiplexer 41A are fed into the video time stamp buffer 
52, the output of which is connected to the video time 
stamp processing module 55. The video time stamp 
processing module controls the timing of the decoding of 
the video stream by the video decoder 45. The audio time 
stamps from the demultiplexer 41A are fed into the audio 
time stamp input buffer 53, the output of which is 
connected to the audio time stamp processing module 56. 
The audio time stamp processing module controls the timing 
of the decoding of the audio stream by the audio decoder 
46. 

In the example shown in Figure 21, the video input 
buffer 42 and the audio input buffer 43 have the 
respective storage capacities defined by the MPEG 
standard. These capacities are set in consideration of 
practical constraints imposed by providing the decoder by 
using a processor that, because of cost constraints, 
cannot have a large amount of storage. 

Video decoder 45 removes the video stream from the 
video input buffer 42 one video access unit at a time, 
i.e., one picture at a time, at a timing corresponding to 
[...] signal, e.g., once every 1/29.97 seconds in an NTSC 
system. The amount of the video stream removed from the 
video input buffer each time varies because of the 
different amount of compression applied to each picture. 

Audio decoder 46 removes the audio stream from the 
audio input buffer 43 one audio access unit at a time at a 
timing corresponding to the audio time stamps and a 
predetermined timing. 

The structure of the encoder 1A is shown in Figure 
22A. Access units of the video signal S2 are fed to the 
input of the video encoder 201A, which compresses each 
access unit, i.e., picture, of the video signal. The 
resulting video access units are fed from the output of 
the video encoder to the input of the video 
output buffer 300, where they are temporarily stored. The 
video stream from the output of the video output buffer is 
fed to the multiplexer 203A. Feedback from the video 
output buffer to the video encoder prevents the output of 
the video encoder from causing the video output buffer to 
overflow. 

The audio signal S3 is fed to the input of the audio 
encoder 202A, which compresses it. The resulting audio 
access units are fed from the output of the audio encoder 
to the input of the audio output buffer 302, where they 
are temporarily stored. The audio stream from the output 
of the audio output buffer is fed to the multiplexer 203A. 
Feedback from the audio output buffer to the audio encoder 
prevents the output of the audio encoder from causing the 
audio output buffer to overflow. 

The encoder 1A also includes the clock signal 
generator 305. In MPEG-1 systems, the frequency of 
the clock signal generator is 90 kHz; in MPEG-2 systems, 
the frequency is 27 MHz. The output of the clock signal 
generator is fed to the clock counter 307, the output of 
which provides a clock reference signal. The clock 
reference signal has a value that is incremented by one 
each cycle of the clock signal. The clock reference 
signal is connected to the header generator 204. In the 
MPEG-2 standard, the clock counter 307 also divides the 
MPEG-2 clock signal by 300 to provide a time stamp clock 
reference signal having a value that is incremented by one 
[...] reference signal to the [...] 
stamp generator 309, [...] 
311 and the audio presentation time stamp 
generator 313. In the [...] clock counter [...] 
clock reference signal to the video decoding time stamp 
generator 309, the video presentation time stamp generator 
311, and the audio presentation time stamp generator 313 
as the time stamp clock reference signal. 

[...] input of the video presentation time stamp 
generator 311. The video presentation time stamp 
generator generates a presentation time stamp (PTS) in 
response to each picture of the video input signal and 
the time stamp clock reference signal. The presentation 
time stamps are fed via the time stamp re-ordering buffer 
304 to the video time stamp buffer 301. Each video 
presentation time stamp is the value of the time stamp 
clock reference signal at the instant the video encoder 
receives the start of a picture of the video input signal. 

The time stamp re-ordering buffer 304 receives a 
re-order flag signal from the video encoder 201A each time 
the latter, in the course of compressing the video 
signal S2, changes the order of the access units of the 
video stream relative to the order of the access units of 
the video input signal S2. In response to the re-order 
flag signal, the time stamp re-ordering buffer changes the 
order of the presentation time stamps generated by the 
video presentation time stamp generator 311 to match the 
order of the access units of the video stream the video 
encoder feeds into the video output buffer 300. 
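
The effect of the re-ordering can be illustrated with a small sketch. The source says only that the re-ordering buffer follows a re-order flag from the video encoder. The rule used below is an assumption chosen for illustration: each I- or P-picture is moved ahead of the B-pictures that precede it in display order. The function name and the tuple layout are likewise hypothetical.

```python
from typing import List, Tuple

Picture = Tuple[str, int]  # (picture type, presentation time stamp)


def reorder_to_coded_order(display: List[Picture]) -> List[Picture]:
    """Re-order (type, PTS) pairs from display order into coded order."""
    coded: List[Picture] = []
    pending_b: List[Picture] = []
    for pic in display:
        if pic[0] == "B":
            pending_b.append(pic)  # B-pictures wait for their next reference
        else:
            coded.append(pic)  # the I- or P-picture is emitted first
            coded.extend(pending_b)  # then the B-pictures that preceded it
            pending_b.clear()
    coded.extend(pending_b)  # trailing B-pictures, if any
    return coded
```

For a display-order sequence I B B P B B P with time stamps 0 to 6, the time stamps leave the buffer in the order 0, 3, 1, 2, 6, 4, 5, matching the order of the access units in the video stream.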

The video encoder 201A feeds a flag signal to the 
input of the video decoding time stamp generator 309 at 
the same instant as it feeds the start of an access unit of 
the video stream to the video output buffer 300. In 
response to each flag signal and the time stamp clock 
reference signal, the video decoding time stamp generator 
generates a video decoding time stamp (video DTS), which 
it feeds to the video time stamp buffer 301. The video 
decoding time stamp is the value of the time stamp clock 
reference signal at the instant the flag signal indicates 
that the encoder has fed the start of the access unit of 
the video stream into the video input buffer. 

The video time stamp buffer 301 temporarily stores the 
video time stamps. The video time stamp buffer also 
receives and stores pointers from the video encoder 201A 
to enable it to relate each video time stamp that it 
receives to the picture header of each video access unit 
stored in the video output buffer 300. The video time 
stamp buffer later feeds the video time stamps to the 




multiplexer 203A. The video decoding time stamps are fed 
to the multiplexer via the adder 319, where they are 
incremented by the value of the SELECTED V BUFFERING DELAY 
(which will be described in more detail below). The video 
presentation time stamps PTS are fed to the multiplexer 
via the adder 321, where they are incremented by the value 
of the total video delay (which will be described below). 
The multiplexer selectively adds the video time stamps to 
the packet headers of the video packets in the multiplexed 
bit stream according to the occupancy of the video time 
stamp buffer 52 of the system target decoder 4A. 

The audio encoder 202A feeds a flag signal to the 
input of the audio presentation time stamp generator 313 
coincident with it feeding the start of each access unit 
of the audio stream to the audio output buffer 302. In 
response to this flag signal and the time stamp clock 
reference signal, the audio presentation time stamp 
generator generates an audio presentation time stamp, 
which it feeds to the audio time stamp buffer 303. Each 
audio presentation time stamp is the value of the time 
stamp clock reference signal at the instant the flag 
signal indicates that the audio encoder has fed an access 
unit of the audio stream into the audio input buffer. 

The audio time stamp buffer 303 temporarily stores the 
audio presentation time stamps. The audio time stamp 
buffer also receives pointers from the audio encoder 202A 
to enable it to relate each audio time stamp that it 
receives to the address of the header of each audio access 
unit stored in the audio output buffer 302. The audio 
time stamp buffer 303 later feeds the audio presentation 
time stamps to the multiplexer 203A. The multiplexer 
selectively adds the audio time stamps to the packet 
headers of the audio packets in the multiplexed bit stream 
according to the occupancy of the audio time stamp buffer 
53 of the system target decoder 4A. 

The video output buffer 300, video time stamp buffer 
301, audio output buffer 302, audio time stamp buffer 303 
and time stamp re-ordering buffer 304 are all first-in 
first-out (FIFO) buffers. 

The time stamp generators 309, 311, and 313 may be 
integrated with their respective video and audio time 
stamp buffers 301 and 303. Moreover, a single clock 
reference signal could be used, and could be divided by 
300 in the time stamp generators to provide the time stamp 
clock reference signal. 
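
The divide-by-300 relationship between the 27 MHz MPEG-2 clock and the time stamp clock can be checked with a few lines of arithmetic. The constant and function names below are illustrative assumptions. Only the 27 MHz, 90 kHz and 300 figures come from the text.

```python
MPEG2_CLOCK_HZ = 27_000_000  # MPEG-2 clock signal frequency given above
TIME_STAMP_DIVISOR = 300     # divisor applied to obtain the time stamp clock


def time_stamp_clock(clock_reference: int) -> int:
    """Convert a 27 MHz clock reference count to a time stamp clock count."""
    return clock_reference // TIME_STAMP_DIVISOR


# The divided clock ticks at 90 kHz, the same rate as the MPEG-1 clock.
TIME_STAMP_CLOCK_HZ = MPEG2_CLOCK_HZ // TIME_STAMP_DIVISOR
```

One second of the 27 MHz clock reference therefore corresponds to 90 000 ticks of the time stamp clock reference.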

The header generator 204 generates the various headers 
of the multiplex layer, i.e., the pack headers and the 
various packet headers. The header generator receives the 
clock reference from the clock counter 307, and feeds the 
headers into the multiplexer 203A. 






Figure 23 shows the structure of the decoder 6A in the 
encoding/decoding system 10A. The decoder 6A is designed 
in consideration of the parameters of the system target 
decoder 4A (Figure 21) to decode the multiplexed bit 
stream produced by the encoder 1A. As a result, the 
decoder 6A has a structure very similar to that of the 
system target decoder 4A. 

The decoder 6A includes the demultiplexer 61A, which 
receives the multiplexed bit stream S5 from the medium 5. 
The demultiplexer demultiplexes the multiplexed bit stream 
[...] time stamps and the audio time stamps S5TA. 

The video stream S5V from the output of the 
[...] 
which precedes the video decoder 65. The audio stream S5A 
from the demultiplexer is fed into the audio input buffer 
63, which precedes the audio decoder 66. The video time 
stamps S5TV from the demultiplexer are fed into the video 
time stamp buffer 72. The video time stamps are read out 
from the video time stamp buffer into the video time stamp 
processing module 75, which controls the timing of the 
decoding of the video access units in the video stream S5V 
by the video decoder 65. The audio time stamps S5TA from 
the demultiplexer are fed into the audio time stamp buffer 
73. The audio time stamps are read out from the audio time 
stamp buffer into the audio time stamp processing module 
76, which controls the timing of the decoding of the audio 
access units in the audio stream S5A by the audio decoder 
66. 

The video decoder 65 removes each access unit, i.e., 
picture, of the video stream from the video input buffer 
62 for decoding in the order in which the access unit was 
received by the video input buffer. The audio decoder 66 
removes each access unit of the audio stream from the 
audio input buffer 63 for decoding in the order in which 
the access unit was received by the audio input buffer. 

The operation of the encoding and decoding system 10A 
described above will now be described. 

If still pictures are encoded, the MPEG-2 standard 
requires that: 

- each still picture have an associated time stamp 
that determines how long the picture will be displayed; 

- each still picture be displayed for at least 2 
picture periods. Consequently, the maximum still picture 
rate is, e.g., 25 Hz/2 = 12.50 Hz for PAL display devices, 
and 29.97 Hz/2 = 14.99 Hz for NTSC display devices; and 

- still picture video consist only of I-pictures. 

Consequently, decoders receiving the bit stream from 
the encoder must buffer and use all video time stamps to 
reconstruct a still picture video bit stream with the 
correct timing. In an actual decoding system according to 
the invention, a separate video time stamp buffer is used 
for this purpose. To allow relatively small time stamp 
buffers to be used for this purpose and to guarantee that 
such time stamp buffers will never overflow, the system 
target decoder according to the invention also includes a 
video time stamp buffer (or a functionally-equivalent 
parameter constraint) which affects certain parameters of 
the encoding system. 
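
The maximum still picture rates quoted in the list above follow directly from the two-picture-period rule. A minimal sketch, with a hypothetical function name and parameter names:

```python
def max_still_picture_rate(display_rate_hz: float, min_picture_periods: int = 2) -> float:
    """Highest still picture rate when each still picture must be shown
    for at least `min_picture_periods` picture periods."""
    return display_rate_hz / min_picture_periods
```

Calling it with 25 Hz gives 12.5 Hz, and with 29.97 Hz gives 14.985 Hz, which the text rounds to 14.99 Hz.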

Using the arrangement shown in Figure 22B, the 
one-pass encoder shown in Figure 22A can configure itself 
to comply with the constraints of this model in addition 
to being capable of configuring itself to encode a normal 
full-motion video signal. 

Referring to Figures 22A and 22B, to comply with the 
STD video time stamp buffer constraint, the encoder 1A 
first determines, at block 351, the STD video stream 
buffering delay that will prevent the STD video time stamp 
buffer 52 from overflowing. This value will be called 
DELAY THAT WORKS. 

DELAY THAT WORKS = 
size of STD time stamp buffer 52 / time stamp coding 
frequency. 
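
As a worked example of the formula above, the following sketch computes DELAY THAT WORKS. It assumes the buffer size is expressed as a number of time stamps and the coding frequency in time stamps per second, so that the result is in seconds. These units, the function name, and the sample values are assumptions and are not stated in the source.

```python
def delay_that_works(ts_buffer_size: int, ts_coding_frequency_hz: float) -> float:
    """DELAY THAT WORKS in seconds.

    ts_buffer_size: capacity of the STD time stamp buffer, in time stamps.
    ts_coding_frequency_hz: time stamps coded per second.
    """
    if ts_coding_frequency_hz <= 0:
        raise ValueError("time stamp coding frequency must be positive")
    return ts_buffer_size / ts_coding_frequency_hz
```

For instance, a buffer holding 5 time stamps with a coding frequency of 12.5 Hz gives a delay of 0.4 seconds. Halving the coding frequency doubles the delay.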

In a system with a relatively low video bit rate 
(e.g., in many still picture applications), a buffering 
delay longer than the value of DELAY THAT WORKS is 
necessary for optimum picture quality. Therefore, in such 
a system, the time stamp coding frequency is reduced as 
much as possible (as is allowed for still-picture video by 
the MPEG-2 standard). Using locked encoding systems helps 
achieve this goal. Alternatively, the size of the video 
time stamp buffer 52 in the system target decoder may be 
increased to provide a longer delay. As a further 
alternative, both the time stamp coding frequency may be 
reduced and the STD video time stamp buffer size may be 
increased. 

For example, for still picture video using, e.g., a 50 
Hz display device, the encoder will calculate the time 
stamp coding frequency tscf using the formula: 

tscf = 12.5/N 

(N is a positive integer) 

Since the MPEG-2 standard requires that one time stamp 
be provided for each still picture, when used for 
generating a bit stream representing still picture video, 
the video encoder 201A will also generate I-pictures at a 
reduced rate, i.e., at the rate of 12.5/N Hz, if the time 
stamp coding frequency is reduced. The value of N is set 
by the encoder operator. 

Block 353 determines the video stream buffering 
delay that is needed to generate the worst case (i.e., the 
largest possible) picture using the video 
input buffer 42. This value will be called 
DELAY FOR BIG PICTURE. 

DELAY FOR BIG PICTURE = 

size of STD video input buffer 42 / bit rate of the 
video stream. 

In practice, to make the video bit stream "safe" for 
all decoders, the encoder 1A may use a value smaller than 
the actual size of the system target decoder video input 
buffer 42 in the above formula. 

The value of DELAY FOR BIG PICTURE can easily be 
longer than one second in systems in which the video bit 
rate is relatively low. 

Block 357 compares DELAY FOR BIG PICTURE with DELAY 
THAT WORKS to determine the value of the selected decoder 
video buffering delay (SELECTED V BUFFERING DELAY). If 

DELAY FOR BIG PICTURE =< DELAY THAT WORKS, 

the encoding system will set the value of 
SELECTED V BUFFERING DELAY to DELAY FOR BIG PICTURE. 

In some applications, DELAY FOR BIG PICTURE will be 
greater than DELAY THAT WORKS. In this case, to comply with the 
STD constraints, the encoder will set the value of 
SELECTED V BUFFERING DELAY = DELAY THAT WORKS. 

The value of SELECTED V BUFFERING DELAY is fed to the 
adder 319 and to block 363. 
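
The calculations of blocks 353 and 357, together with the tscf formula, can be sketched in Python as follows. This is only an illustrative sketch: the function names, parameter names and units (bits, bits per second, seconds) are assumptions and do not appear in the description.

```python
def time_stamp_coding_frequency(n: int, max_still_rate_hz: float = 12.5) -> float:
    """tscf = 12.5/N for a 50 Hz display device; N is a positive integer."""
    if n < 1:
        raise ValueError("N must be a positive integer")
    return max_still_rate_hz / n


def delay_for_big_picture(video_input_buffer_bits: int, video_bit_rate_bps: float) -> float:
    """Block 353: time (seconds) needed to deliver a picture that fills the STD video input buffer."""
    return video_input_buffer_bits / video_bit_rate_bps


def select_v_buffering_delay(big_picture_delay_s: float, works_delay_s: float) -> float:
    """Block 357: use DELAY FOR BIG PICTURE unless it exceeds DELAY THAT WORKS."""
    if big_picture_delay_s <= works_delay_s:
        return big_picture_delay_s
    return works_delay_s
```

With illustrative values of a 1,835,008-bit input buffer and a 458,752 bit/s video stream, DELAY FOR BIG PICTURE is 4 s; it is selected if DELAY THAT WORKS is 4 s or more, and otherwise DELAY THAT WORKS is selected.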
Block 359 calculates the memory quantity video output 
buffer size required for the video output buffer 300. The 
memory quantity video output buffer size is calculated 
using the SELECTED V BUFFERING DELAY and the available 
video bit rate as follows: 

video output buffer size (bytes) = 

SELECTED V BUFFERING DELAY * available video bit 
rate / 8. 

Block 359 feeds the value of video output buffer size 
to the video output buffer 300. 

Block 361 calculates the memory quantity video time 
stamp buffer size required for the video time stamp buffer 
301. The memory quantity required is that which will hold 
the number of presentation time stamps (PTS) and decoding 
time stamps (DTS) given by: 

SELECTED V BUFFERING DELAY * time stamp coding 
frequency. 

Block 361 feeds the value of video time stamp buffer 
size to the video time stamp buffer 301. 
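
The two sizing formulas of blocks 359 and 361 might be sketched as below. Rounding up to a whole number of bytes or time stamps is an assumption added here so that the result is usable as a memory size; the names are likewise illustrative.

```python
import math


def video_output_buffer_bytes(selected_v_buffering_delay_s: float,
                              video_bit_rate_bps: float) -> int:
    """Block 359: delay * bit rate / 8, rounded up to whole bytes (rounding is an assumption)."""
    return math.ceil(selected_v_buffering_delay_s * video_bit_rate_bps / 8)


def video_time_stamp_buffer_entries(selected_v_buffering_delay_s: float,
                                    ts_coding_frequency_hz: float) -> int:
    """Block 361: number of time stamps that accumulate during the buffering delay."""
    return math.ceil(selected_v_buffering_delay_s * ts_coding_frequency_hz)
```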

At blocks 363, 365 and 367, the encoder calculates the 
audio encoder buffering delay (from which the audio output 
buffer size and the audio time stamp buffer size are 
calculated) from the total video delay and the audio 
decoder buffering delay. To achieve end-to-end 
synchronization between audio and video, the end-to-end 
delays of the audio stream and the video stream through 
the encoder and the decoder must be equal, as shown in 
Figure 24B. 

Figure 24A shows the components of the end-to-end 
system delay of the video stream, which 
is calculated in block 363. This delay is called the 
total video delay. 

total video delay = 

SELECTED V BUFFERING DELAY + SELECTED V REORDERING 
DELAY. 

The value of the SELECTED V REORDERING DELAY (SVRD), 
which also affects picture quality, is usually one or more 
picture periods. The SELECTED V REORDERING DELAY is the 
sum of two components, namely, the video encoder 
reordering delay (verd) and the video decoder reordering 
delay (vdrd). In this example, verd is assumed to be 
zero, and vdrd is set to one picture period. 
Consequently, the SELECTED V REORDERING DELAY is one 
picture period. 

The SELECTED V BUFFERING DELAY is also the sum of two 
components, namely, the video encoder buffering delay 
(vebd) and the video decoder buffering delay (vdbd). 

The value of total video delay calculated by the block 
363 is fed to the adders 321 and 323, and to the block 367. 
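
As a rough illustration of block 363, the total video delay can be written as the sum of the buffering delay and the two reordering components. The function and parameter names are assumptions; the default of verd = 0 mirrors the example in the text.

```python
def total_video_delay(selected_v_buffering_delay_s: float,
                      vdrd_s: float,
                      verd_s: float = 0.0) -> float:
    """Block 363: buffering delay plus reordering delay (verd + vdrd), in seconds."""
    selected_v_reordering_delay_s = verd_s + vdrd_s
    return selected_v_buffering_delay_s + selected_v_reordering_delay_s
```

For a 25 Hz picture rate, one picture period is 0.04 s, so a 4 s buffering delay with vdrd of one picture period gives a total video delay of about 4.04 s.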
The audio input buffer 43 of the system target 
decoder 4A is relatively small, and the audio decoder 46 
removes the audio stream from the audio input buffer at a 
relatively constant rate. Furthermore, the audio access 
units are not reordered. Block 365 calculates the audio 
decoder buffering delay (adbd) of the audio stream in the 
STD as follows: 

audio decoder buffering delay = 

size of STD audio input buffer 43 / audio bit rate. 

In practice, to make the audio bit stream "safe" for 
all decoders, the encoder may use a value smaller than the 
actual size of the system target decoder audio input 
buffer 43 in the above formula. 

The audio decoder buffering delay is small compared 
with the total video delay. As a result, the audio 
decoder buffering delay (adbd) calculated by block 365 is 
usually relatively short. To provide the required 
end-to-end synchronization between audio and video, it is 
not usually possible to reduce the total video delay 
because of picture quality requirements. Therefore, the 
block 367 calculates from the total video delay and the 
audio decoder buffering delay a value of the audio encoder 
buffer delay (aebd) that is sufficiently large to make the 
total audio delay match the total video delay, as shown in 
Figure 24B. 
To provide the audio encoder buffering delay aebd, 
block 369 calculates the memory quantity audio output 
buffer size required for the audio 
output buffer 302 as follows: 

audio output buffer size (bytes) = 

audio encoder buffering delay 
* audio bit rate / 8. 

The block 369 feeds the value of audio output buffer 
size to the audio output buffer 302. 

Block 371 calculates the memory quantity audio time 
stamp buffer size (in time stamps) required for the audio 
time stamp buffer 303 as follows: 

audio time stamp buffer size (time stamps) = 

audio encoder buffering delay 
* audio access unit rate. 

Block 371 feeds the value of audio time stamp 
buffer size to the audio time stamp buffer 303. 
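
The audio-side calculations of blocks 365, 367, 369 and 371 can be sketched as follows. The sketch assumes that the total audio delay is simply aebd + adbd (the audio access units are not reordered), so that aebd is the total video delay minus adbd; that subtraction, the rounding up, and all names are assumptions for illustration rather than the stated method.

```python
import math


def audio_decoder_buffering_delay(audio_input_buffer_bits: int,
                                  audio_bit_rate_bps: float) -> float:
    """Block 365: adbd = size of STD audio input buffer / audio bit rate (seconds)."""
    return audio_input_buffer_bits / audio_bit_rate_bps


def audio_encoder_buffering_delay(total_video_delay_s: float, adbd_s: float) -> float:
    """Block 367 (assumed form): aebd such that aebd + adbd equals the total video delay."""
    aebd_s = total_video_delay_s - adbd_s
    if aebd_s < 0:
        raise ValueError("audio decoder delay exceeds total video delay")
    return aebd_s


def audio_output_buffer_bytes(aebd_s: float, audio_bit_rate_bps: float) -> int:
    """Block 369: aebd * audio bit rate / 8, rounded up to whole bytes."""
    return math.ceil(aebd_s * audio_bit_rate_bps / 8)


def audio_time_stamp_buffer_entries(aebd_s: float, audio_access_unit_rate_hz: float) -> int:
    """Block 371: aebd * audio access unit rate, rounded up to whole time stamps."""
    return math.ceil(aebd_s * audio_access_unit_rate_hz)
```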

The encoder set up procedure was described with 
reference to a low bit-rate application. A similar 
procedure can be used to set up the encoder 1A for normal 
full-motion video, or for applications, such as 
professional video applications, in which a very short 
buffering delay (e.g., about 0.2 s) is required. 

Returning now to the encoder shown in Figure 22A, after 
the encoder has calculated the parameters just described, 
and has used these parameters to set up the video output 
buffer 300, the video time stamp buffer 301, the audio 
output buffer 302, the audio time stamp buffer 303 and the 
adders 319, 321, and 323, the encoder operates with these 
parameters to encode the video input signal S2 and the 
audio input signal S3 as follows. The video encoder 201A 
and the 
audio encoder 202A start encoding their respective input 
signals at the same time. Once the encoding process has 
started, and until the end of the respective input signals 
S2 and S3, the video encoder 201A will generate video 
access units at the selected picture rate and feed them to 
the video output buffer 300, and the audio encoder 202A 
will generate audio access units (AAU) depending on the 
selected audio sampling rate and number of samples per 
AAU, and feed them to the audio output buffer 302. The 
video encoder 201A includes a rate control mechanism 
(indicated by the path connecting the video output buffer 
and the video encoder) that prevents overflow of the video 
output buffer 300. By preventing overflow of the video 
output buffer having a size set according to the value of 
video output buffer size, as described above, the video 
encoder 201A executes one of the tasks necessary to make 
the multiplexed bit stream S1A compliant with the 
constraints imposed by the system target decoder 4A. 

During the encoding process, the 33-bit clock 
reference signal from the clock counter 307 continuously 
increments at the rate of 90 kHz in an MPEG-1 system, or at 
27 MHz in an MPEG-2 system. Also, in an MPEG-2 system, the 
time stamp clock reference signal increments at the 
rate of 90 kHz. 
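
A 33-bit counter running at 90 kHz wraps around after 2^33 ticks, i.e. after roughly 26.5 hours. The following Python sketch shows one way such a time stamp clock could be modelled; the names and the modular-difference helper are assumptions for illustration and are not taken from the description.

```python
TIME_STAMP_CLOCK_HZ = 90_000       # time stamp clock rate given in the text
COUNTER_BITS = 33                  # width of the clock reference signal
COUNTER_MODULUS = 1 << COUNTER_BITS


def seconds_to_ticks(seconds: float) -> int:
    """Convert a time in seconds to a 33-bit, 90 kHz counter value (wrapping)."""
    return int(round(seconds * TIME_STAMP_CLOCK_HZ)) % COUNTER_MODULUS


def ticks_elapsed(start_ticks: int, end_ticks: int) -> int:
    """Ticks from start to end, correct across a single counter wrap."""
    return (end_ticks - start_ticks) % COUNTER_MODULUS
```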

Each time the beginning of an access unit of the video 
input signal S2 arrives at the video encoder 201A, the 
video PTS generator 311 determines the value of the time 
stamp clock reference signal from the clock counter 307 as 
a video presentation time stamp (PTS). The video PTS 
generator feeds the PTS to the time stamp re-ordering 
buffer 304, where it is temporarily stored. The PTS is 
associated with the address of the picture header of the 
corresponding video access unit in the re-ordering buffer 
by, for example, a pointer received from the video 
encoder. If, in encoding the video input signal, the 
video encoder reorders a video access unit of the video 
input signal S2, the video encoder feeds the re-order flag 
to the time-stamp re-ordering buffer. In response to the 
re-order flag, the time stamp re-ordering buffer re-orders 
the PTS belonging to that access unit. In other words, 
the time-stamp re-ordering buffer re-orders the PTSs so 
that their order at the output of the time stamp 
re-ordering buffer corresponds to the order of the video 
access units at the output of the video encoder 201A. The 
time-stamp re-ordering circuit feeds the video 
PTSs to the video time stamp buffer 301. 

Each time the video encoder 201A feeds an access unit 
of the video stream into the video output buffer 300, the 
video DTS generator 309 determines the value of the time 
stamp clock reference signal from the clock counter 307 as 
the video decoding time stamp (video DTS) of that video 
access unit. The video DTS generator feeds the video DTS 
to the video time stamp buffer 301, where it is stored 
together with the PTS from the time-stamp re-ordering 
buffer 304. Together with the video time stamps, the 
video time stamp buffer also receives from the video 
encoder 201A and stores a pointer that indicates the 
address in the video output buffer 300 of the picture 
header of the video access unit to which the time stamps 
belong. 

Each time the audio encoder 202A feeds an access unit 
of the audio stream into the audio output buffer 302, the 
audio PTS generator 313 determines the value of the time 
stamp clock reference signal from the clock counter 307 as 
the audio presentation time stamp (audio PTS) of that 
audio access unit. The audio PTS is stored in the audio 
time stamp buffer 303, together with a pointer indicating 
the address in the audio output buffer 302 of the header of 
the access unit to which the audio time stamp belongs. 

To generate the correct time stamp values, except for 
the picture reordering delay, the video encoder 201A and 
the audio encoder 202A theoretically produce access units 
instantaneously, and without delay. Consequently, for 
certain pictures, the video PTS and the video DTS stored 
in the time stamp buffer will have exactly the same 
values. Because real hardware implementations operate 
with delays, these delays must be taken into account when 
the time stamps are generated. For example, the time 
stamp generators 309, 311 and 313 can provide time stamp 
values that are additionally incremented to take account 
of real processing delays. 

When the beginning of the video stream enters the 
video output buffer 300, the header generator 204 
generates a header, which it feeds to the multiplexer 
203A. The header generator receives the clock reference 
signal from the clock counter 307, and includes in the 
clock reference field of the header the value of the clock 
reference signal at the time the beginning of the video 
stream entered the video output buffer. 

Next, the header generator 204 generates the video 
packet header for the first video packet of the 
video stream, which it feeds to 
the multiplexer 203A. The video packet header includes 
a length field, the value of which depends on the number 
of bytes of video stream that will follow the video packet 
header. The video packet length depends on the 
application, and on the multiplexing strategy. 

If the video packet includes an access unit header, 
the video packet header may also include a time stamp. 
Whether the video packet header is to include a time stamp 
can be determined by checking the video stream to be 
inserted in the video packet (which depends on the current 
read pointer to the video output buffer 300 and the video 
packet length) and by checking whether the pointer stored 
in video time stamp buffer 301 points to this segment of 
the video stream. Also, the multiplexer performs 
processing that emulates tracking the state of occupancy 
of the video time stamp buffer 52 in the system target 
decoder. If adding a time stamp to the video packet 
header would cause the video time stamp buffer to 
overflow, the multiplexer will not add a time stamp. On 
the other hand, if the video time stamp buffer is close to 
empty, the multiplexer may begin a new video packet so 
that a time stamp can be added to the multiplexed bit 
stream. In the manner just described, the multiplexer 
prevents the video time stamp buffer from overflowing or 
underflowing. Similar processing is carried out to 
prevent the audio time stamp buffer 53 from overflowing or 
underflowing. 
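
The decision logic described above might be sketched as two small predicates operating on the emulated occupancy of the STD time stamp buffer. The function names and the `low_water_mark` threshold used to decide when the buffer is "close to empty" are hypothetical; the description does not give a specific threshold.

```python
def may_add_time_stamp(has_access_unit_header: bool,
                       std_ts_occupancy: int,
                       std_ts_capacity: int) -> bool:
    """True if a time stamp may be added without overflowing the emulated STD buffer."""
    return has_access_unit_header and std_ts_occupancy < std_ts_capacity


def should_begin_new_packet(std_ts_occupancy: int, low_water_mark: int = 1) -> bool:
    """True if the emulated STD buffer is close to empty (hypothetical threshold)."""
    return std_ts_occupancy <= low_water_mark
```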

The decoding time stamps and presentation time stamps 
are respectively fed from the video time stamp buffer 301 
into the multiplexer via the adders 319 and 321. The 
adder 321 increments each presentation time stamp by the 
value of the total video delay calculated by the total 
video delay calculation circuit 363 as described above, 
and the adder 319 increments each decoding time stamp by 
the SELECTED V BUFFERING DELAY calculated by the SELECTED V 
BUFFERING DELAY calculating circuit 357 as described 
above. If the incremented PTS and the incremented DTS 
have different values, the multiplexer 203A will insert 
both of them into the video packet header. If the 
incremented PTS and the incremented DTS have the same 
value (i.e., when the picture is a B-picture), only one 
time stamp is inserted into the video packet header. 
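
The action of the adders 319 and 321 and the rule for choosing which time stamps go into the video packet header can be sketched as follows. All values are in 90 kHz ticks. The function name and the dictionary return type are assumptions, as is the choice to keep the PTS (rather than the DTS) when the two incremented values coincide; counter wrap-around is ignored for brevity.

```python
def stamps_for_packet_header(pts_ticks: int,
                             dts_ticks: int,
                             total_video_delay_ticks: int,
                             selected_v_buffering_delay_ticks: int) -> dict:
    """Apply adders 321 (PTS) and 319 (DTS) and pick the stamps to insert."""
    pts_out = pts_ticks + total_video_delay_ticks            # adder 321
    dts_out = dts_ticks + selected_v_buffering_delay_ticks   # adder 319
    if pts_out == dts_out:
        # Only one time stamp is inserted; keeping the PTS is an assumption.
        return {"PTS": pts_out}
    return {"PTS": pts_out, "DTS": dts_out}
```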

When the video input signal S2 is a full-motion video 
signal, the multiplexer 203A will read the video stream 
from the video output buffer 300 and 
insert it into the multiplexed bit stream S1A after 
completing the video packet header. While the video 
stream is being read from the video output buffer 300, the 
read pointer to the video output buffer 300 is compared 
with the oldest pointer in the time stamp buffer 301 that 
points to the address of one of the picture headers stored 
in the video output buffer 300. When these pointers are 
equal, the PTS, DTS and associated pointer will be removed 
from the video time stamp buffer 301. This happens when 
the video packet includes more than one picture header. 

When the video input signal S2 is an MPEG-style still 
picture video signal, because each picture must have an 
associated time stamp, the encoder will insert a new video 
packet header including time stamps just before each 
picture header. 

The encoder will reduce the size of a video packet 
and/or stop inserting new video packets into the 
multiplexed bit stream for a number of reasons, including: 

1. to insert an audio packet into the multiplexed bit 
stream; 

2. the video output buffer 300 is empty; or 

3. there is no more video stream. 

Case 1 occurs at regular intervals that are shorter 
than the audio decoder buffer delay adbd. The first audio 
packet will not be inserted into the multiplexed bit 
stream until the audio encoder buffer delay time aebd has 
elapsed. However, dummy audio packets (or other useful 
information included in packets with the same size as 
audio packets) may be inserted into the multiplexed bit 
stream instead of audio packets before this time has 
elapsed. This maintains the video bit rate at the 
intended video bit rate, and prevents a temporary increase 
in the video bit rate that may violate the STD buffering 
constraints. 

After the audio encoder buffer delay time aebd has 
elapsed, an actual audio packet is generated, and the 
header generator 204 will generate an audio packet header. 
If the audio packet includes an audio access unit header, 
the audio time stamp buffer 303 will feed the oldest audio 
PTS stored therein to the multiplexer 203A, and the 
multiplexer will include the PTS in the audio packet 
header. The audio PTS is fed via the adder 323, which 
increments the oldest audio PTS by the value of the total 
video delay calculated by the total video delay 
calculating circuit 363, as described above. 

As the multiplexer 203A transfers the audio stream 
from the audio output buffer 302 to the multiplexed bit 
stream S1A, the audio time stamp buffer 303 will discard 
those time stamps whose pointers point to addresses in the 
audio output buffer equal to the read pointer of the audio 
output buffer 302. 

Audio packets will continue to be generated until all 
the audio stream generated by the audio encoder 202A from 
the audio input signal has been inserted into the 
multiplexed bit stream S1A. If, after this, any other 
elementary stream data needs to be transmitted, this 
stream data can be inserted into the multiplexed bit 
stream S1A. Otherwise, dummy packets are again inserted 
into the multiplexed bit stream S1A at regular intervals 
instead of actual audio packets in order to maintain the 
intended video bit rate. 

Concerning case 2, in constant bit rate systems, the 
video encoder 201A monitors the occupancy of video output 
buffer 300, and can usually prevent the video output 
buffer 300 from becoming empty. The video encoder can 
generate additional video stream to refill the video 
output buffer by reducing the video compression ratio when 
the video output buffer approaches empty. If, despite 
such measures, the video output buffer 300 does become 
empty, the multiplexer 203A can include other useful 
information in the multiplexed bit stream S1A instead of 
the video stream. If such useful information is not 
available, the multiplexer can include stuffing bits in 
the multiplexed bit stream to maintain the target bit 
rate. 

In a variable bit rate system, the multiplexer 203A 
can simply wait until it is time to write an audio packet 
or, if it is too early to write an audio packet, it can 
wait until a new video access unit enters the video output 
buffer 300. This can then lead to generation of a new 
video packet. 

Case 3 occurs when all the video input signal S2 has 
been converted into the multiplexed bit stream S1A. The 
encoder may continue to generate other packets if data 
streams for such packets are still to be inserted in S1A. 

Figure 25 illustrates the operation of the decoder 6A 
with a low bit rate multiplexed bit stream. The low bit 
rate multiplexed stream shown in Figure 25 does not comply 
with the MPEG-2 still picture video requirements set forth 
above. The MPEG standard provides a multiplexed bit 
stream including a video stream with a picture rate that 
is an integral fraction of the normal picture rate of 
about 25 or 30 frames per second (the highest picture rate 
allowed is one half of the normal picture rate). The MPEG 
standard leaves it to the decoder to perform non-standard 
processing to derive from the multiplexed bit stream a 
video signal with the normal picture rate for feeding to 
a display device that requires a video signal with a 
normal picture rate. The decoder does this by reading out 
each of the decoded pictures stored in its output buffer 
several times at the normal picture rate. The additional 
processing required to decode the video stream with the 
below-normal picture rate increases the complexity and 
cost of the decoder. 

Additional complexity in the decoder can be avoided by 
providing to the decoder a still picture video stream 
having a normal picture rate. An uncompressed still 
picture video signal consists of consecutive pictures at 
the normal picture rate. Consecutive pictures are 
identical except at the points in the video signal at 
which the picture changes. Such a signal is encoded by 
coding the first picture after a picture change as an 
I-picture. All the other pictures in the video signal are 
also coded, but as minimal P-pictures. The video stream 
resulting from each of such pictures is little more than 
headers, and requires only a few hundred bits. 
Consequently, low bit-rate still picture video can be 
provided using a video stream that has a normal picture 
rate with only a slight reduction in the number of bits 
available to code the first picture after each picture 
change. 
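
The choice of picture coding type described above can be illustrated by a short Python sketch. It assumes, purely for illustration, that each input picture carries an identifier that changes only when the picture content changes; the function name and that representation are not part of the description.

```python
def picture_coding_types(picture_ids: list) -> list:
    """Return "I" for the first picture after each change, "P" for repeats."""
    types = []
    previous = object()  # sentinel that compares unequal to any identifier
    for picture_id in picture_ids:
        types.append("I" if picture_id != previous else "P")
        previous = picture_id
    return types
```

For example, three repeats of one picture followed by two repeats of another would be coded as I, P, P, I, P.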

The structure of the multiplexed bit stream S5A 
received by the decoder 6A from the medium 5 is shown 
across the top of Figure 25. The video stream consists of 
plural pictures at the standard picture rate, i.e., 25 or 
30 frames per second. The pictures are grouped into 
groups of pictures (GOP), each of which begins with the 
first picture following a picture change (an I-picture), 
followed by a number of P-pictures. The number of 
P-pictures corresponds to the number of normal picture 
periods between each picture change in the still picture 
video signal, in the example shown, to nine picture 
periods. The GOPs are included in the video stream so 
that each GOP is preceded by a video packet header 
including time stamps, 

[...] bit index of the video input buffer [...] in the 
[...] stamp buffer 72. 

At the beginning of the video stream, the time stamp [...] 
Once the video packet header has been demultiplexed, 
the video stream of the first picture accumulates in the 
video input buffer at a substantially constant rate (the 
interruptions in the flow that occur each time an audio 
packet is fed into the audio input buffer 63 and each time 
a video packet header is demultiplexed have been omitted 
for clarity). The video stream is contained in several 
[...] regular intervals in the multiplexed bit stream, and the 
requirement that a time stamp (which requires a video 
packet header) be included in the video stream at least 
[...] picture to accumulate in the video input buffer 62. 
Then, after the video stream of the first I-picture has 
been stored in the video input buffer, the video streams 
of the P-pictures following the I-picture are fed into the 
video input buffer. 

When the picture header of the first picture in the 
video stream following the video packet header including a 
time stamp is written into the video input buffer, a 
pointer to the address of the picture header is written in 
a table in the video time stamp buffer 72. 

During accumulation of the video stream in the video 
input buffer 62, additional time stamps accumulate in the 
video time stamp buffer 72, as shown in the lower bit 
index curve. These time stamps do not cause the video 
time stamp buffer to overflow because the encoder 
controlled the addition of time stamps to the video stream 
in consideration of the occupancy of the video time stamp 
buffer. 

After the initial buffering delay, which allows 
sufficient video stream to accumulate in the video input 
buffer 62, the video stream of the first I-picture is 
removed from the video input buffer. In the example 
shown, the initial buffering delay is four seconds. Once 
the initial buffering delay is over, the video decoder 65 
removes access units of the video stream from the video 
input buffer at the normal picture rate. During removal 
of these video streams from the video input buffer, the 
bit index shown in the Figure changes imperceptibly due to 
the small size of these pictures. The video decoder also 
checks the table in the video input buffer using the read 
[...] 
decoder can determine whether the picture has a time stamp 
(in still picture video, all the I-pictures will have a time 
stamp, but not all the P-pictures will have a time stamp. 
In full motion video, not all pictures will have a time 
stamp since the time stamp buffer has insufficient size to 
accommodate a time stamp for every picture). If the 
picture has a time stamp, the time stamp for the picture 
will be removed from the video input buffer, and will be 
used to determine the decoding time of the picture. If 
the picture lacks a time stamp, the decoding time will be 
determined by the decoder clock. The resulting decoded 
pictures are fed to the decoder output at the normal 
picture rate to provide the still picture display. 

In phase-locked systems, time stamps are only required 
to set the start up delays of the audio decoder and the 
video decoder. Because the decoders are locked to a 
common reference, there is no need to use the time stamps 
to maintain synchronism between the video decoder and the 
audio decoder. In such a system, the first audio time 
stamp and the first video time stamp are respectively used 
to set the audio start up delay and the video start up 
delay. All other time stamps are ignored. 

In such a system according to the invention, the 
system target decoder is defined as follows. The time 
stamp buffers 52 and 53 have a capacity of one time stamp. 
Operation of the video decoder 55 is defined so that it 
removes a time stamp from the video time stamp buffer only 
at the beginning of the multiplexed bit stream and at no 
other time. Operation of the audio decoder 56 is defined 
so that it removes a time stamp from the audio time stamp 
buffer 53 only at the beginning of the multiplexed bit 
stream and at no other time. The video decoder 55 and the 
audio decoder 56 are locked to a common clock reference. 

With such a system target decoder, the encoder will 
add the first video time stamp generated and the first 
audio time stamp generated to the multiplexed bit stream. 
The decoder removes these time stamps from the time 
stamp buffers. Since the STD will require no more time 
stamps, the encoder adds no more time stamps to the 
multiplexed bit stream. This makes it possible to 
eliminate the time stamp fields from the packet headers, 
allowing the bits saved to be used for other purposes. 
The invention has been described with respect to a 
system in which both audio and video streams are included 
in the multiplexed bit stream. However, the invention can 
be applied equally well to systems in which either an 
audio stream or a video stream is included in the 
multiplexed bit stream without the other. The invention 
can also be applied to streams resulting from compressing 
other types of information signal. The invention has also 
been described with respect to the MPEG-1 and MPEG-2 
standards, but the invention can be applied equally well 
to information streams and bit streams that do not comply 
with the MPEG standards. 

CLAIMS 

1. A method of generating a bit stream by 
multiplexing non-compressed auxiliary information with an 
information stream, the information stream being obtained 
by compressing fixed-size units of an information signal 
with a varying compression ratio to provide varying-sized 
units of the information stream, the auxiliary information 
being for use in subsequently decoding the information 
stream, units of the auxiliary information corresponding 
to the units of the information signal, the method 
comprising the steps of: 

dividing the information stream in time into 
information stream portions; 

dividing the non-compressed auxiliary information 
in time into auxiliary information portions; 

interleaving the information stream portions and 
the auxiliary information portions to provide the bit 
stream; and 

controlling the information stream dividing, 
auxiliary information dividing, and interleaving steps by 
emulating decoding of the bit stream by a hypothetical 
system target decoder including a demultiplexing means for 
demultiplexing the bit stream, a serial arrangement of an 
information stream buffer and an information stream 
decoder, and a serial arrangement of an auxiliary 
information buffer and an auxiliary information processor, 
each serial arrangement being connected to the 
demultiplexing means, the information stream dividing, 
auxiliary information dividing, and interleaving steps 
being controlled such that the information stream buffer 
and the auxiliary information buffer neither overflow nor 
underflow. 

2. The method of claim 1, wherein, in the step of 
controlling the information stream dividing, auxiliary 
information dividing, and interleaving steps: 

the demultiplexing means receives the bit stream 
and extracts therefrom the information stream and 
auxiliary information for feeding to the information 
stream buffer and the auxiliary information buffer, 
respectively; 

the information stream buffer has a first target 
size; 

the auxiliary information buffer has a second 
target size; 

the information stream decoder removes the 
varying-sized units of the information stream from the 
information stream buffer at a first target timing; and 

the auxiliary information processor removes the 
corresponding fixed-sized units of the auxiliary 
information from the auxiliary information buffer at a 
second target timing. 

3. The method of claim 2, wherein, in the 
interleaving step: 

the bit stream comprises plural layers; and 

the information stream portions and the auxiliary 
information portions are interleaved in the same one of 
the plural layers of the bit stream. 

4. The method of claim 3, wherein the auxiliary 
information is directory information for the information 
stream. 

5. The method of claim 4, wherein the information 
stream includes plural access points, and each unit of the 
directory information relates to one of the access points. 

6. The method of claim 5, wherein: 

in the step of dividing the auxiliary information 
into auxiliary information portions, the directory 
information is divided into a directory packet including a 
number of units of directory information determined by the 
second target size; 

in the step of dividing the information stream 
into information stream portions, the information stream 
is divided into a set of plural information packets, the set 
of plural information packets including a number of access 
points equal to the number of units of directory 
OCID: <WO 94300 1 4MJL> 



10 



PCT/JP94/00942 

WO 94/30014 A 



118 

information in the directory packet; and 

in the interleaving step, the directory packet is 
interleaved adjacent the set of information packets. 
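The grouping recited in claim 6 can be sketched as follows. This is an illustration only, with several assumptions that the claim does not state: the number of directory units per directory packet is taken to be how many fixed-size entries fit in the second target size, each directory packet is placed immediately before its set of information packets (the claim only requires adjacency), and the names `group_with_directory`, `entry_bytes` and `aux_buffer_bytes` are hypothetical.

```python
def group_with_directory(packets, entry_bytes, aux_buffer_bytes):
    """Return a multiplex order of directory and information packets.

    packets: list of (payload, is_access_point) in stream order.
    Each directory packet lists the indices of the access points in the
    set of information packets that follows it.
    """
    per_directory = aux_buffer_bytes // entry_bytes  # assumed sizing rule
    if per_directory < 1:
        raise ValueError("auxiliary buffer too small for one directory unit")

    out = []
    group, entries = [], []
    for index, (payload, is_access_point) in enumerate(packets):
        # Close the current set once it holds enough access points and a
        # new access point is about to start.
        if is_access_point and len(entries) == per_directory:
            out.append(("DIR", entries))
            out.extend(group)
            group, entries = [], []
        group.append(("INFO", payload))
        if is_access_point:
            entries.append(index)
    if group:
        # The final set may hold fewer access points than the others.
        out.append(("DIR", entries))
        out.extend(group)
    return out
```

With two entries per directory packet, five packets of which three are access points are emitted as one directory packet covering the first two access points, followed by a second, shorter directory packet for the remainder.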
7. The method of claim 2, wherein, in the
interleaving step:

the bit stream comprises plural layers; and

the information stream portions are interleaved in
a first layer of the bit stream, and the auxiliary
information portions are interleaved in a second layer of
the bit stream, different from the first layer.

8. The method of claim 7, wherein the information
stream comprises plural access units and the auxiliary
information is a set of time stamps for decoding the
access units of the information stream.
9. The method of claim 8, wherein:

in the controlling step the auxiliary information
buffer has an occupancy determined by the second target
size, the auxiliary information fed from the
demultiplexing means, and the auxiliary information
removed by the auxiliary information processor;

the step of dividing the information stream into
information stream portions divides the information stream
into plural information packets;

the step of dividing the auxiliary information
into auxiliary information portions divides the set of
time stamps into time stamps;

the step of interleaving the information stream
additionally includes the step of providing an information
packet header for each information packet; and

in the step of interleaving the information stream
portions and the auxiliary information portions, a time
stamp is included in the information packet header of ones
of the information packets selected according to the
occupancy of the auxiliary information buffer.
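The selection in claim 9 can be pictured with a minimal sketch. It assumes, hypothetically, that the auxiliary information buffer is measured in whole time stamps, that the emulation already knows how many stamps the auxiliary information processor removes before each packet arrives, and that a stamp is placed in a packet header whenever the emulated buffer has room. The function name and parameters are illustrative and do not come from the specification.

```python
def select_stamped_packets(n_packets, capacity, removals):
    """Choose which information packets carry a time stamp in their header.

    capacity: number of time stamps the emulated auxiliary buffer can hold.
    removals[i]: stamps the emulated processor removes before packet i arrives.
    Returns the indices of the packets whose header receives a time stamp.
    """
    occupancy = 0
    stamped = []
    for i in range(n_packets):
        # The processor drains the buffer first; occupancy is clamped at zero,
        # so this sketch does not detect underflow.
        occupancy = max(0, occupancy - removals[i])
        if occupancy < capacity:
            # Room is available: this header carries a stamp.
            stamped.append(i)
            occupancy += 1
    return stamped
```

For example, with a capacity of two stamps and removals `[0, 0, 0, 1, 0, 2]`, packets 0, 1, 3 and 5 are selected, while packets 2 and 4 are sent without a stamp because the emulated buffer is full.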
10. The method of claim 8, wherein:

in the controlling step the information stream
buffer has a first target size, and the auxiliary
information buffer has a second target size, and the
auxiliary information buffer has an occupancy determined
by the second target size, the auxiliary information fed
from the demultiplexer, and the auxiliary information
removed by the auxiliary information processor;

the step of dividing the information stream into
information stream portions divides the information stream
into plural information packets;

the step of dividing the auxiliary information
into auxiliary information portions divides the set of
time stamps into time stamps;

the step of interleaving the information stream
additionally includes the step of providing an information
packet header for each information packet; and

in the step of interleaving the information stream
portions and the auxiliary information portions, time
stamps are periodically included in the information packet
header of the information packets at a time stamp buffer
frequency; and

in the controlling step, at least one of the time
stamp coding frequency and the second target size is
controlled in such a manner that maximizes the occupancy
of the information stream buffer without causing the
information stream buffer to overflow.
11. The method of claim 7, wherein: 

the information stream decoder is one of plural
information stream decoders, the information stream
decoders being phase locked; and

the auxiliary information buffer has a size set to
accommodate one and no more than one unit of the auxiliary
information.

12. An encoder for generating a bit stream, the
encoder comprising:

means for compressing fixed-size units of an
information signal with a varying compression ratio to
provide varying-sized units of an information stream;

information stream dividing means for dividing the
information stream in time into information stream
portions;

auxiliary information dividing means for dividing
non-compressed auxiliary information in time into
auxiliary information portions, the auxiliary information
being for use in subsequently decoding the information
stream, units of the auxiliary information corresponding
to the units of the information signal;

multiplexing means for sequentially arranging the
information stream portions and the auxiliary information
portions to provide the bit stream, the multiplexing means
including a control means for controlling the information
stream dividing means and the auxiliary information
dividing means by emulating decoding of the bit stream by
a system target decoder including a demultiplexing means
for demultiplexing the bit stream, a serial arrangement of
an information stream buffer and an information stream
decoder, and a serial arrangement of an auxiliary
information buffer and an auxiliary information processor,
each of the serial arrangements being connected to the
demultiplexing means, the control means controlling the
information stream dividing means and the auxiliary
information dividing means such that the information
stream buffer and the auxiliary information buffer neither
underflow nor overflow.

13. The encoder of claim 12, wherein:

the demultiplexing means receives the bit stream
and extracts therefrom the information stream and the
auxiliary information for feeding to the information
stream buffer and the auxiliary information buffer,
respectively;

the information stream buffer has a first target
size;

the auxiliary information buffer has a second
target size;

the information stream decoder removes the
varying-sized units of the information stream from the
information stream buffer at a first timing; and

the auxiliary information processor removes the
corresponding fixed-sized units of the auxiliary
information from the auxiliary information buffer at a
second target timing.

14. The encoder of claim 12, wherein:

the bit stream provided by the multiplexing means
comprises plural layers; and

the multiplexing means arranges the information
stream portions and the auxiliary information portions in
the same one of the plural layers of the bit stream.
15. The system of claim 12, wherein:

the bit stream provided by the multiplexing means
comprises plural layers; and

the multiplexing means arranges the time-divided
portions of the information stream in a first layer of
the bit stream and arranges the non-compressed auxiliary
information in a second layer of the bit stream, different
from the first layer.

16. A system wherein an information signal is
compressed for transfer, together with non-compressed
auxiliary information, to a medium as a bit stream, and
wherein the bit stream is transferred from the medium and
is processed to recover the information signal by
expansion, and to recover the auxiliary information, the
auxiliary information being for use in recovering the
information signal, the system comprising:

an encoder comprising:

means for compressing the information signal
to provide an information stream, fixed-sized
units of the information signal being compressed
using a varying compression ratio to provide
varying-sized units of the information stream,
and

multiplexing means for sequentially arranging
time-divided portions of the information stream
and time-divided portions of the non-compressed
auxiliary information to provide the bit stream
for transfer to the medium, the multiplexing
means including a control means for determining
a division of the information stream and of the
auxiliary information into the respective
time-divided portions by emulating decoding of the
bit stream by a system target decoder including
a demultiplexer means for demultiplexing the bit
stream, a serial arrangement of an information
stream buffer and an information stream decoder,
and a serial arrangement of an auxiliary
information buffer and an auxiliary information
processor, each of the serial arrangements being
connected to the multiplexing means, the
information stream buffer and the auxiliary
information buffer each having a size; and
a decoder comprising:

demultiplexing means for extracting the
information stream and the auxiliary
information from the bit stream transferred from
the medium,

first input buffer means for receiving the
auxiliary information from the demultiplexing
means, the first input buffer means having a size
of at least the size of the auxiliary
information buffer,

means for removing a unit of the auxiliary
information from the first input buffer means,

second input buffer means for receiving the
information stream from the demultiplexing
means, the second input buffer means having a size
of at least the size of the information stream
buffer, and

decoder means for removing one of the
varying-sized units of the information stream
from the second input buffer means and for
expanding the removed unit of the information
stream to recover the information signal.



17. The system of claim 16, wherein the control means 
determines the division of the information stream and of 
the auxiliary information into the respective time-divided 
portions such that the bit stream, when subject to the
emulated decoding by the system target decoder, causes the
information stream buffer and the auxiliary information
buffer neither to underflow nor overflow.

18. The system of claim 16, wherein: 

the bit stream provided by the multiplexing means 
has plural layers; and 

the multiplexing means arranges the time-divided
portions of the information stream and of the
non-compressed auxiliary information in the same one of the
plural layers of the bit stream.

19. The system of claim 18, wherein the auxiliary 
information is directory information relating to the
information stream.

20. The system of claim 19, wherein the information 
stream includes plural access points, and each unit of the 
directory information relates to one of the access points. 

21. The system of claim 19, wherein the control means
determines a division of the directory information into
directory packets each including plural units of directory
information, and determines a division of the information
stream into sets of plural information stream packets,
each set of plural information stream packets including a
number of access points equal to the units of directory
information; and

the multiplexing means multiplexes each directory
packet adjacent the set of information stream packets
including the access points whereto the directory
information in the directory packet relates.

22. The system of claim 16, wherein: 

the bit stream provided by the multiplexing means
has plural layers; and

the multiplexing means arranges the time-divided
portions of the information stream in a first layer of the
bit stream and arranges the non-compressed auxiliary
information in a second layer of the bit stream, different
from the first layer.

23. The system of claim 22, wherein the information
stream comprises plural access units and the auxiliary
information is a set of time stamps for decoding the
access units of the information stream.

24. The system of claim 23, wherein:

the auxiliary information buffer has an occupancy
determined by the size of the auxiliary information
buffer, the auxiliary information fed from the
demultiplexer, and the auxiliary information removed by
the auxiliary information processor;

the control means is for:

determining a division of the information stream
into plural information packets and providing an
information packet header for each information packet,

determining a division of the set of time stamps
into time stamps;

sequentially arranging the information stream
packets and the auxiliary information portions, time
stamps are periodically included in the information packet
header of the information packets at a time stamp buffer
frequency; and

controlling at least one of the time stamp coding
frequency and the size of the auxiliary information buffer
in such a manner that maximizes the occupancy of the
information stream buffer without causing the information
stream buffer to overflow.

25. A method of deriving a bit stream from an 
information signal, the method comprising the steps of: 

compressing units of the information signal to 
provide units of an information stream, the units of the 
information stream including access points; 

deriving from the information stream pointers
pointing to the access points in the information stream; and

multiplexing the information stream divided into
information packets together with pointer packets to
provide the bit stream such that a set of information
packets containing plural consecutive access points is
multiplexed adjacent a pointer packet containing the
pointers pointing only to the plural consecutive access
points.

26. The method of claim 25, wherein: 

the multiplexing step multiplexes the information 
packets together with pointer packets containing dummy 
pointers prior to the deriving step; and 

the method additionally comprises the step of
overwriting the dummy pointers with the pointers derived
in the deriving step, the pointers overwritten into each
pointer packet being the pointers pointing to the plural
consecutive access points immediately preceding the
pointer packet in the bit stream.
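The two-pass scheme of claim 26 can be sketched as follows. This is an illustration under assumptions that the claim does not fix: pointers are 32-bit big-endian byte offsets, every pointer packet holds `POINTERS_PER_PACKET` pointers, and each access unit starts at an access point. The names `mux_with_dummy_pointers` and `overwrite_pointers` are hypothetical. The first function multiplexes the stream with dummy pointers as placeholders; the second writes the real pointers into each pointer packet once the positions of the immediately preceding access points are known.

```python
import struct

POINTERS_PER_PACKET = 2   # assumed number of pointers in each pointer packet
DUMMY = 0xFFFFFFFF        # assumed placeholder value for a dummy pointer
FORMAT = f">{POINTERS_PER_PACKET}I"


def mux_with_dummy_pointers(access_units):
    """First pass: lay out the stream, inserting dummy pointer packets.

    Returns (stream, access_point_offsets, pointer_packet_offsets).
    """
    stream = bytearray()
    ap_offsets, ptr_offsets = [], []
    for count, unit in enumerate(access_units, start=1):
        ap_offsets.append(len(stream))       # where this access point starts
        stream += unit
        if count % POINTERS_PER_PACKET == 0:
            ptr_offsets.append(len(stream))  # where the pointer packet starts
            stream += struct.pack(FORMAT, *([DUMMY] * POINTERS_PER_PACKET))
    return stream, ap_offsets, ptr_offsets


def overwrite_pointers(stream, ap_offsets, ptr_offsets):
    """Second pass: replace each dummy packet with the real pointers to the
    access points immediately preceding it in the stream."""
    for k, pos in enumerate(ptr_offsets):
        start = k * POINTERS_PER_PACKET
        pointers = ap_offsets[start:start + POINTERS_PER_PACKET]
        struct.pack_into(FORMAT, stream, pos, *pointers)
```

Because the dummy packets already occupy their final size, overwriting them does not shift any access point, so the pointers remain valid after the second pass.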

27. A method of deriving a bit stream from an
information signal, the method comprising the steps of:

providing an encoder including: means for
compressing units of the information signal to provide
units of an information stream, first buffer means, having
a size, for buffering the units of the information stream,
means for providing a time stamp when the first buffer
means receives each access unit of the information stream,
second buffer means, having a size, for buffering the
time stamps, and multiplexing means for multiplexing the
information stream from the first buffer means and the
time stamps from the second buffer means to provide the
bit stream;

defining a hypothetical system target decoder, the
hypothetical system target decoder including a
demultiplexer means for demultiplexing the bit stream, a
serial arrangement of an information stream buffer and an
information stream decoder, and a serial arrangement of a
time stamp buffer and a time stamp processor, each serial
arrangement being connected to the demultiplexer;

determining the size of the first buffer means and
the size of the second buffer means by emulating decoding
of the bit stream using the hypothetical system target
decoder; and

encoding the information signal using the encoder
with the size of the first buffer means and the size of the
second buffer means set to the respective sizes determined
by the determining step.

28. The method of claim 27, wherein: 

in the step of defining the system target decoder: 
the information stream buffer and the time stamp 
buffer each have a size, and the information stream 
decoder decodes the information stream in response to time
stamps removed from the time stamp buffer by the time stamp
processor; and

in the determining step, the size of the first 
buffer means and the size of the second buffer means are 

determined from. 

29. The method of claim 28, wherein: 

in the encoder, the multiplexing means 
periodically includes time stamps in the bit stream at a 
time stamp coding frequency; 

the information stream has a bit rate; and 
in the determining step, a buffering delay is
derived from the time stamp coding frequency and the bit
rate, and the size of the information stream buffer and
the size of the time stamp buffer are derived from the
buffering delay.

30. A decoder for a bit stream obtained by
multiplexing non-compressed auxiliary information with an
information stream, the information stream being obtained
by compressing fixed-size units of an information signal
with a varying compression ratio to provide varying-sized
units of the information stream, the auxiliary information
being for use in subsequently decoding the information
stream, units of the auxiliary information corresponding
to the units of the information signal, the decoder
comprising:

demultiplexing means for extracting the
information stream and the auxiliary information from the
bit stream;

first input buffer means for receiving the
auxiliary information from the demultiplexing means;

means for removing a unit of the auxiliary
information from the first input buffer means;

second input buffer means for receiving the
information stream from the demultiplexing means; and

decoder means for removing one of the
varying-sized units of the information stream from the second
input buffer means and expanding the removed unit of the
information stream in response to the unit of the
auxiliary information to recover a fixed-size unit of the
information signal.

31. The decoder of claim 30, wherein the decoder means
removes the one of the varying-sized units of the
information stream from the second input buffer means at a
time indicated by the unit of the auxiliary information.
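The decoder of claims 30 and 31 can be pictured with a minimal sketch. It rests on assumptions that the claims do not specify: the bit stream is modelled as a sequence of tagged `(kind, payload)` pairs, each auxiliary unit is simply a decode time, and the whole stream is demultiplexed before decoding starts, which a real-time decoder would not do. The function name `run_decoder` is hypothetical, and `zlib` is used only as a stand-in expander in the usage example.

```python
from collections import deque


def run_decoder(bit_stream, expand):
    """Demultiplex a tagged stream and expand one unit per auxiliary unit.

    bit_stream: iterable of (kind, payload); kind is "aux" (payload is a
    decode time) or "info" (payload is one compressed unit).
    expand: callable that expands one compressed unit.
    Returns a list of (decode_time, expanded_unit).
    """
    aux_buffer = deque()    # first input buffer means
    info_buffer = deque()   # second input buffer means

    # Demultiplexing: route each payload to its own input buffer.
    for kind, payload in bit_stream:
        if kind == "aux":
            aux_buffer.append(payload)
        elif kind == "info":
            info_buffer.append(payload)
        else:
            raise ValueError(f"unknown packet kind: {kind}")

    # Decoding: each auxiliary unit indicates when the corresponding
    # compressed unit is removed from its buffer and expanded.
    output = []
    while aux_buffer and info_buffer:
        decode_time = aux_buffer.popleft()
        unit = info_buffer.popleft()
        output.append((decode_time, expand(unit)))
    return output
```

As a usage example, passing `zlib.decompress` as `expand` over a stream of alternating `("aux", time)` and `("info", zlib.compress(frame))` pairs returns each frame paired with its decode time.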